IMPROVED ADENOVIRUS AND METHODS OF USE THEREOF



This invention was supported by the National
Institutes of Health Grant No. P30 DK 47757. The United
States government has rights in this invention.

Field of the Invention 

The present invention relates to the field of
vectors useful in somatic gene therapy and the production
thereof.



Background of the Invention 

Human gene therapy is an approach to treating human
disease that is based on the modification of gene
expression in cells of the patient. It has become
apparent over the last decade that the single most
outstanding barrier to the success of gene therapy as a
strategy for treating inherited diseases, cancer, and
other genetic dysfunctions is the development of useful
gene transfer vehicles. Eukaryotic viruses have been
employed as vehicles for somatic gene therapy. Among the
viral vectors that have been cited frequently in gene
therapy research are adenoviruses.

Adenoviruses are eukaryotic DNA viruses that can be
modified to efficiently deliver a therapeutic or reporter
transgene to a variety of cell types. Recombinant
adenoviruses types 2 and 5 (Ad2 and Ad5, respectively),
which cause respiratory disease in humans, are currently
being developed for gene therapy. Both Ad2 and Ad5
belong to a subclass of adenovirus that is not
associated with human malignancies. Recombinant
adenoviruses are capable of providing extremely high
levels of transgene delivery to virtually all cell types,
regardless of the mitotic state. High titers (10^13
plaque forming units/ml) of recombinant virus can be
easily generated in 293 cells (the adenovirus equivalent
to retrovirus packaging cell lines) and cryo-stored for
extended periods without appreciable losses. The
efficacy of this system in delivering a therapeutic
transgene in vivo that complements a genetic imbalance
has been demonstrated in animal models of various
disorders [Y. Watanabe, Atherosclerosis, 36:261-268
(1986); K. Tanzawa et al, FEBS Letters, 118(1):81-84
(1980); J.L. Goldstein et al, New Engl. J. Med.,
309(11983):288-296 (1983); S. Ishibashi et al, J. Clin.
Invest., 92:883-893 (1993); and S. Ishibashi et al, J.
Clin. Invest., 93:1885-1893 (1994)]. Indeed, a
recombinant replication defective adenovirus encoding a
cDNA for the cystic fibrosis transmembrane regulator
(CFTR) has been approved for use in at least two human
clinical trials [see, e.g., J. Wilson, Nature, 365:691-692
(Oct. 21, 1993)]. Further support of the safety of
recombinant adenoviruses for gene therapy is the
extensive experience of live adenovirus vaccines in human
populations.

Human adenoviruses are comprised of a linear,
approximately 36 kb double-stranded DNA genome, which is
divided into 100 map units (m.u.), each of which is 360
bp in length. The DNA contains short inverted terminal
repeats (ITR) at each end of the genome that are required
for viral DNA replication. The gene products are
organized into early (E1 through E4) and late (L1 through
L5) regions, based on expression before or after the
initiation of viral DNA synthesis [see, e.g., Horwitz,
Virology, 2d edit., ed. B. N. Fields, Raven Press, Ltd.,
New York (1990)].
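
The map-unit arithmetic described above can be sketched in a few lines. This is an illustration only, assuming the nominal 36 kb genome length quoted in the text; the names GENOME_BP, bp_to_map_units and map_units_to_bp are hypothetical and form no part of the disclosure.

```python
# Illustrative sketch only: assumes the nominal 36 kb genome quoted in the text.
GENOME_BP = 36_000                        # approximate genome length (assumption)
MAP_UNITS = 100                           # the genome is divided into 100 map units
BP_PER_MAP_UNIT = GENOME_BP // MAP_UNITS  # 360 bp per map unit


def bp_to_map_units(bp: int) -> float:
    """Convert a base-pair position to map units."""
    return bp / BP_PER_MAP_UNIT


def map_units_to_bp(mu: float) -> int:
    """Convert map units to the nearest base-pair position."""
    return round(mu * BP_PER_MAP_UNIT)


if __name__ == "__main__":
    print(bp_to_map_units(360))   # one map unit
    print(map_units_to_bp(100))   # full nominal genome length
```

Because the real length of a given serotype differs slightly from 36 kb, positions converted this way are approximate.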

The first-generation recombinant, replication-deficient
adenoviruses which have been developed for gene
therapy contain deletions of the entire E1a and part of
the E1b regions. This replication-defective virus is
grown on an adenovirus-transformed, complementation human
embryonic kidney cell line containing a functional
adenovirus E1a gene which provides a transacting E1a
protein, the 293 cell [ATCC CRL1573]. E1-deleted viruses
are capable of replicating and producing infectious virus
in the 293 cells, which provide E1a and E1b region gene
products in trans. The resulting virus is capable of
infecting many cell types and can express the introduced
gene (providing it carries its own promoter), but cannot
replicate in a cell that does not carry the E1 region DNA
unless the cell is infected at a very high multiplicity
of infection.

However, in vivo studies revealed transgene
expression in these E1 deleted vectors was transient and
invariably associated with the development of severe
inflammation at the site of vector targeting [S.
Ishibashi et al, J. Clin. Invest., 93:1885-1893 (1994);
J. M. Wilson et al, Proc. Natl. Acad. Sci., USA, 85:4421-4424
(1988); J. M. Wilson et al, Clin. Bio., 2:21-26
(1991); M. Grossman et al, Som. Cell. and Mol. Gen.,
17:601-607 (1991)]. One explanation that has been
proposed for this finding is that first generation
recombinant adenoviruses, despite the deletion of E1
genes, express low levels of other viral proteins. This
could be due to basal expression from the unstimulated
viral promoters or transactivation by cellular factors.
Expression of viral proteins leads to cellular immune
responses to the genetically modified cells, resulting in
their destruction and replacement with nontransgene
containing cells.

There yet remains a need in the art for the
development of additional adenovirus vector constructs
for gene therapy.






WO 96/13597 



PCT/US95/14017



Summary of the Invention

In one aspect, the invention provides the components
of a novel recombinant adenovirus production system. One
component is a shuttle plasmid, pAdΔ, that comprises
adenovirus cis-elements necessary for replication and
virion encapsidation and is deleted of all viral genes.
This vector carries a selected transgene under the
control of a selected promoter and other conventional
vector/plasmid regulatory components. The other
component is a helper adenovirus, which alone or with a
packaging cell line, supplies sufficient gene sequences
necessary for a productive viral infection. In a
preferred embodiment, the helper virus has been altered
to contain modifications to the native gene sequences
which direct efficient packaging, so as to substantially
disable or "cripple" the packaging function of the helper
virus or its ability to replicate.

In another aspect, the present invention provides a
unique recombinant adenovirus, an AdΔ virus, produced by
use of the components above. This recombinant virus
comprises an adenovirus capsid, adenovirus cis-elements
necessary for replication and virion encapsidation, but
is deleted of all viral genes (i.e., all viral open
reading frames). This virus particle carries a selected
transgene under the control of a selected promoter and
other conventional vector regulatory components. This
AdΔ recombinant virus is characterized by high titer
transgene delivery to a host cell and the ability to
stably integrate the transgene into the host cell
chromosome. In one embodiment, the virus carries as its
transgene a reporter gene. Another embodiment of the
recombinant virus contains a therapeutic transgene.

In another aspect, the invention provides a method
for producing the above-described recombinant AdΔ virus
by co-transfecting a cell line (either a packaging cell
line or a non-packaging cell line) with a shuttle vector
or plasmid and a helper adenovirus as described above,
wherein the transfected cell generates the AdΔ virus.
The AdΔ virus is subsequently isolated and purified
therefrom.

In yet a further aspect, the invention provides a
method for delivering a selected gene to a host cell for
expression in that cell by administering an effective
amount of a recombinant AdΔ virus containing a
therapeutic transgene to a patient to treat or correct a
genetically associated disorder or disease.

Other aspects and advantages of the present 
invention are described further in the following detailed 
description of the preferred embodiments thereof. 


Brief Description of the Figures 

Fig. 1A is a schematic representation of the
organization of the major functional elements that define
the 5' terminus from Ad5 including an inverted terminal
repeat (ITR) and a packaging/enhancer domain. The TATA
box of the E1 promoter (black box) and E1A
transcriptional start site (arrow) are also shown.

Fig. 1B is an expanded schematic of the
packaging/enhancer region of Fig. 1A, indicating the five
packaging (PAC) domains (A-repeats), I through V. The
arrows indicate the location of PCR primers referenced in
Figs. 9A and 9B below.

Fig. 2A is a schematic of shuttle vector
pAdΔ.CMVLacZ containing 5' ITR from Ad5, followed by a
CMV promoter/enhancer, a LacZ gene, a 3' ITR from Ad5,
and remaining plasmid sequence from plasmid pSP72
backbone. Restriction endonuclease enzymes are
represented by conventional designations in the plasmid
constructs.

Fig. 2B is a schematic of the shuttle vector
digested with EcoRI to release the modified AdΔ genome
from the pSP72 plasmid backbone.

Fig. 2C is a schematic depiction of the function of
the vector system. In the presence of an E1-deleted
helper virus Ad.CBhpAP which encodes a reporter minigene
for human placenta alkaline phosphatase (hpAP), the
AdΔ.CMVLacZ genome is packaged into preformed virion
capsids, distinguishable from the helper virions by the
presence of the LacZ gene.

Figs. 3A to 3F [SEQ ID NO: 1] report the top DNA
strand of the double-stranded plasmid pAdΔ.CMVLacZ. The
complementary sequence may be readily obtained by one of
skill in the art. The sequence includes the following
components: 3' Ad ITR (nucleotides 607-28 of SEQ ID NO:
1); the 5' Ad ITR (nucleotides 5496-5144 of SEQ ID NO:
1); CMV promoter/enhancer (nucleotides 5117-4524 of SEQ
ID NO: 1); SD/SA sequence (nucleotides 4507-4376 of SEQ
ID NO: 1); LacZ gene (nucleotides 4320-845 of SEQ ID NO:
1); and a poly A sequence (nucleotides 837-639 of SEQ ID
NO: 1).

Fig. 4A is a schematic of shuttle vector
pAdΔc.CMVLacZ containing an Ad5 5' ITR and 3' ITR
positioned head-to-tail, with a CMV enhancer/promoter-LacZ
minigene immediately following the 5' ITR, followed
by a plasmid pSP72 (Promega) backbone. Restriction
endonuclease enzymes are represented by conventional
designations in the plasmid constructs.

Fig. 4B is a schematic depiction of the function of
the vector system of Fig. 4A. In the presence of helper
virus Ad.CBhpAP, the circular pAdΔc.CMVLacZ shuttle
vector sequence is packaged into virion heads,
distinguishable from the helper virions by the presence
of the LacZ gene.

Figs. 5A to 5F [SEQ ID NO: 2] report the top DNA
strand of the double-stranded vector pAdΔc.CMVLacZ. The
complementary sequence may be readily obtained by one of
skill in the art. The sequence includes the following
components: 5' Ad ITR (nucleotides 600-958 of SEQ ID NO:
2); CMV promoter/enhancer (nucleotides 969-1563 of SEQ ID
NO: 2); SD/SA sequence (nucleotides 1579-1711); LacZ gene
(nucleotides 1762-5236 of SEQ ID NO: 2); poly A sequence
(nucleotides 5245-5443 of SEQ ID NO: 2); and 3' Ad ITR
(nucleotides 16-596 of SEQ ID NO: 2).

Fig. 6 is a schematic of shuttle vector pAdΔ.CBCFTR
containing 5' ITR from Ad5, followed by a chimeric CMV
enhancer/β-actin promoter enhancer, a CFTR gene, a poly-A
sequence, a 3' ITR from Ad5, and remaining plasmid
sequence from plasmid pSL1180 (Pharmacia) backbone.
Restriction endonuclease enzymes are represented by
conventional designations in the plasmid constructs.

Figs. 7A to 7H [SEQ ID NO: 3] report the top DNA
strand of the double-stranded plasmid pAdΔ.CBCFTR. The
complementary sequence may be readily obtained by one of
skill in the art. The sequence includes the following
components: 5' Ad ITR (nucleotides 9611-9254 of SEQ ID
NO: 3); chimeric CMV enhancer/β-actin promoter
(nucleotides 9241-8684 of SEQ ID NO: 3); CFTR gene
(nucleotides 8622-4065 of SEQ ID NO: 3); poly A sequence
(nucleotides 3887-3684 of SEQ ID NO: 3); and 3' Ad ITR
(nucleotides 3652-3073 of SEQ ID NO: 3). The remaining
plasmid backbone is obtained from pSL1180 (Pharmacia).

Fig. 8A illustrates the generation of 5' adenovirus
terminal sequence that contained PAC domains I and II by
PCR. See, arrows indicating righthand and lefthand (PAC
II) PCR probes in Fig. 1B.

Fig. 8B illustrates the generation of 5' terminal
sequence that contained PAC domains I, II, III and IV by
PCR. See, arrows indicating righthand and lefthand (PAC
IV) PCR probes in Fig. 1B.

Fig. 8C depicts the amplification products subcloned
into the multiple cloning site of pAd.Link.1 (IHGT Vector
Core) generating pAd.PACII (domains I and II) and
pAd.PACIV (domains I, II, III, and IV) resulting in
crippled helper viruses, Ad.PACII and Ad.PACIV with
modified packaging (PAC) signals.

Fig. 9A is a schematic representation of the
subcloning of a human placenta alkaline phosphatase
reporter minigene containing the immediate early CMV
enhancer/promoter (CMV), human placenta alkaline
phosphatase cDNA (hpAP), and SV40 polyadenylation signal
(pA) into pAd.PACII to result in crippled helper virus
vector pAdΔ.PACII.CMVhpAP. Restriction endonuclease
enzymes are represented by conventional designations in
the plasmid constructs.

Fig. 9B is a schematic representation of the
subcloning of the same minigene of Fig. 9A into pAd.PACIV
to result in crippled helper virus vector
pAd.PACIV.CMV.hpAP.

Fig. 10 is a flow diagram summarizing the synthesis
of an adenovirus-based polycation helper virus conjugate
and its combination with a pAdΔ shuttle vector to result
in a novel viral particle complex. CsCl band purified
helper adenovirus was reacted with the heterobifunctional
crosslinker sulfo-SMCC and the capsid protein fiber is
labeled with the nucleophilic maleimide moiety. Free
sulfhydryls were introduced onto poly-L-lysine using
2-iminothiolane-HCl and mixed with the labelled adenovirus,
resulting in the helper virus conjugate Ad-pLys. A
unique adenovirus-based particle is generated by
purifying the Ad-pLys conjugate over a CsCl gradient to
remove unincorporated poly-L-lysine, followed by
extensively dialyzing, adding shuttle plasmid DNAs to
Ad-pLys and allowing the complex formed by the shuttle
plasmid wrapped around Ad-pLys to develop.

Fig. 11 is a schematic diagram of pCCL-DMD, which is
described in detail in Example 9 below.

Figs. 12A to 12P provide the continuous DNA sequence
of pAdΔ.CMVmDys [SEQ ID NO: 10].
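
Several of the figure descriptions above note that the complementary strand may be readily obtained from the reported top strand. As a minimal sketch, the standard Watson-Crick reverse complement can be computed as follows. The function name reverse_complement is hypothetical, and the example input is the PAC I A-repeat with its flanking bases as listed in Table 1 of this description, used purely for illustration.

```python
# Illustrative sketch: Watson-Crick reverse complement of a DNA top strand.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    top = "TAGTAAATTTGGGC"   # PAC I with flanking bases, spaces removed
    print(reverse_complement(top))
```

Applying the function twice returns the original strand, which is a convenient sanity check when working with the longer plasmid sequences.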

Detailed Description of the Invention

The present invention provides a unique recombinant
adenovirus capable of delivering transgenes to target
cells, as well as the components for production of the
unique virus and methods for the use of the virus to
treat a variety of genetic disorders.

The AdΔ virus of this invention is a viral particle
containing only the adenovirus cis-elements necessary for
replication and virion encapsidation (i.e., ITRs and
packaging sequences), but otherwise deleted of all
adenovirus genes (i.e., all viral open reading frames).
This virus carries a selected transgene under the control
of a selected promoter and other conventional regulatory
components, such as a poly A signal. The AdΔ virus is
characterized by improved persistence of the vector DNA
in the host cells, reduced antigenicity/immunogenicity,
and hence, improved performance as a delivery vehicle.
An additional advantage of this invention is that the AdΔ
virus permits the packaging of very large transgenes,
such as a full-length dystrophin cDNA for the treatment
of the progressive wasting of muscle tissue
characteristic of Duchenne Muscular Dystrophy (DMD).

This novel recombinant virus is produced by use of
an adenovirus-based vector production system containing
two components: 1) a shuttle vector that comprises
adenovirus cis-elements necessary for replication and
virion encapsidation and is deleted of all viral genes,
which vector carries a reporter or therapeutic minigene
and 2) a helper adenovirus which, alone or with a
packaging cell line, is capable of providing all of the
viral gene products necessary for a productive viral
infection when co-transfected with the shuttle vector.
Preferably, the helper virus is modified so that it does
not package itself efficiently. In this setting, it is
desirably used in combination with a packaging cell line
that stably expresses adenovirus genes. The methods of
producing this viral vector from these components include
both a novel means of packaging of an
adenoviral/transgene containing vector into a virus, and
a novel method for the subsequent separation of the
helper virus from the newly formed recombinant virus.

I. The Shuttle Vector

The shuttle vector, referred to as pAdΔ, is composed
of adenovirus sequences, and transgene sequences,
including vector regulatory control sequences.

A. The Adenovirus Sequences

The adenovirus nucleic acid sequences of the
shuttle vector provide the minimum adenovirus sequences
which enable a viral particle to be produced with the
assistance of a helper virus. These sequences assist in
delivery of a recombinant transgene genome to a target
cell by the resulting recombinant virus.

The DNA sequences of a number of adenovirus
types are available from Genbank, including type Ad5
[Genbank Accession No. M73260]. The adenovirus sequences
may be obtained from any known adenovirus serotype, such
as serotypes 2, 3, 4, 7, 12 and 40, and further including
any of the presently identified 41 human types [see,
e.g., Horwitz, cited above]. Similarly adenoviruses
known to infect other animals may also be employed in the
vector constructs of this invention. The selection of
the adenovirus type is not anticipated to limit the
following invention. A variety of adenovirus strains are
available from the American Type Culture Collection,
Rockville, Maryland, or available by request from a
variety of commercial and institutional sources. In the
following exemplary embodiment an adenovirus, type 5
(Ad5) is used for convenience.

However, it is desirable to obtain a variety of
pAdΔ shuttle vectors based on different human adenovirus
serotypes. It is anticipated that a library of such
plasmids and the resulting AdΔ viral vectors would be
useful in a therapeutic regimen to evade cellular, and
possibly humoral, immunity, and lengthen the duration of
transgene expression, as well as improve the success of
repeat therapeutic treatments. Additionally the use of
various serotypes is believed to produce recombinant
viruses with different tissue targeting specificities.
The absence of adenoviral genes in the AdΔ viral vector
is anticipated to reduce or eliminate adverse CTL
response which normally causes destruction of recombinant
adenoviruses deleted of only the E1 gene.

Specifically, the adenovirus nucleic acid
sequences employed in the pAdΔ shuttle vector of this
invention are adenovirus genomic sequences from which all
viral genes are deleted. More specifically, the
adenovirus sequences employed are the cis-acting 5' and
3' inverted terminal repeat (ITR) sequences of an
adenovirus (which function as origins of replication) and
the native 5' packaging/enhancer domain, that contains
sequences necessary for packaging linear Ad genomes and
enhancer elements for the E1 promoter. These sequences
are the sequences necessary for replication and virion
encapsidation. See, e.g., P. Hearing et al, J. Virol.,
61(8):2555-2558 (1987); M. Grable and P. Hearing, J.
Virol., 64(5):2047-2056 (1990); and M. Grable and P.
Hearing, J. Virol., 66(2):723-731 (1992).

According to this invention, the entire
adenovirus 5' sequence containing the 5' ITR and
packaging/enhancer region can be employed as the 5'
adenovirus sequence in the pAdΔ shuttle vector. This
left terminal (5') sequence of the Ad5 genome useful in
this invention spans bp 1 to about 360 of the
conventional adenovirus genome, also referred to as map
units 0-1 of the viral genome. This sequence is provided
herein as nucleotides 5496-5144 of SEQ ID NO: 1,
nucleotides 600-958 of SEQ ID NO: 2; and nucleotides
9611-9254 of SEQ ID NO: 3, and generally is from about
353 to about 360 nucleotides in length. This sequence
includes the 5' ITR (bp 1-103 of the adenovirus genome),
and the packaging/enhancer domain (bp 194-358 of the
adenovirus genome). See, Figs. 1A, 3, 5, and 7.

Preferably, this native adenovirus 5' region is
employed in the shuttle vector in unmodified form.
However, some modifications including deletions,
substitutions and additions to this sequence which do not
adversely affect its biological function may be
acceptable. See, e.g., WO 93/24641, published December
9, 1993. The ability to modify these ITR sequences is
within the ability of one of skill in the art. See,
e.g., texts such as Sambrook et al, "Molecular Cloning.
A Laboratory Manual", 2d edit., Cold Spring Harbor
Laboratory, Cold Spring Harbor, New York (1989).

The 3' adenovirus sequences of the shuttle
vector include the right terminal (3') ITR sequence of
the adenoviral genome spanning about bp 35,353 - end of
the adenovirus genome, or map units ~98.4-100. This
sequence is provided herein as nucleotides 607-28 of SEQ
ID NO: 1, nucleotides 16-596 of SEQ ID NO: 2; and
nucleotides 3652-3073 of SEQ ID NO: 3, and generally is
about 580 nucleotides in length. This entire sequence is
desirably employed as the 3' sequence of a pAdΔ shuttle
vector. Preferably, the native adenovirus 3' region is
employed in the shuttle vector in unmodified form.
However, some modifications to this sequence which do not
adversely affect its biological function may be
acceptable.
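
The stated lengths of the 5' and 3' terminal sequences can be cross-checked against the nucleotide ranges quoted above. The sketch below is illustrative only: the dictionary and function names are hypothetical, and the ranges are copied from the text, where some are written high-to-low and some low-to-high.

```python
# Illustrative sketch: inclusive lengths of the terminal sequences, computed
# from the nucleotide ranges quoted in the text. Ranges may be written in
# either direction; the inclusive length is the same either way.
FIVE_PRIME = {
    "SEQ ID NO: 1": (5496, 5144),
    "SEQ ID NO: 2": (600, 958),
    "SEQ ID NO: 3": (9611, 9254),
}
THREE_PRIME = {
    "SEQ ID NO: 1": (607, 28),
    "SEQ ID NO: 2": (16, 596),
    "SEQ ID NO: 3": (3652, 3073),
}


def span_length(start: int, end: int) -> int:
    """Inclusive length of a nucleotide range given in either orientation."""
    return abs(end - start) + 1


def lengths(ranges: dict) -> dict:
    """Map each sequence identifier to the length of its quoted range."""
    return {name: span_length(*coords) for name, coords in ranges.items()}


if __name__ == "__main__":
    print(lengths(FIVE_PRIME))    # 5' terminal sequences
    print(lengths(THREE_PRIME))   # 3' terminal sequences
```

The computed 5' lengths fall between 353 and 359 nucleotides and the 3' lengths are 580 or 581 nucleotides, consistent with the approximate figures given in the text.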

An exemplary pAdΔ shuttle vector of this
invention, described below and in Fig. 2A, contains only
those adenovirus sequences required for packaging
adenoviral genomic DNA into a preformed capsid head. The
pAdΔ vector contains Ad5 sequences encoding the 5'
terminal and 3' terminal sequences (identified in the
description of Fig. 3), as well as the transgene
sequences described below.

From the foregoing information, it is expected
that one of skill in the art may employ other equivalent
adenovirus sequences for use in the AdΔ vectors of this
invention. These sequences may include other adenovirus
strains, or the above mentioned cis-acting sequences with
minor modifications.

B. The Transgene

The transgene sequence of the vector and
recombinant virus is a nucleic acid sequence or reverse
transcript thereof, heterologous to the adenovirus
sequence, which encodes a polypeptide or protein of
interest. The transgene is operatively linked to
regulatory components in a manner which permits transgene
transcription.

The composition of the transgene sequence will
depend upon the use to which the resulting virus will be
put. For example, one type of transgene sequence
includes a reporter sequence, which upon expression
produces a detectable signal. Such reporter sequences
include without limitation an E. coli beta-galactosidase
(LacZ) cDNA, a human placental alkaline phosphatase gene
and a green fluorescent protein gene. These sequences,
when associated with regulatory elements which drive
their expression, provide signals detectable by
conventional means, e.g., ultraviolet wavelength
absorbance, visible color change, etc.

Another type of transgene sequence includes a
therapeutic gene which expresses a desired gene product
in a host cell. These therapeutic nucleic acid sequences
typically encode products for administration and
expression in a patient in vivo or ex vivo to replace or
correct an inherited or non-inherited genetic defect or
treat an epigenetic disorder or disease. Such
therapeutic genes which are desirable for the performance
of gene therapy include, without limitation, a normal
cystic fibrosis transmembrane regulator (CFTR) gene (see
Fig. 7), a low density lipoprotein (LDL) gene [T.
Yamamoto et al, Cell, 39:27-28 (November, 1984)], a DMD
cDNA sequence [partial sequences available from GenBank,
Accession Nos. M36673, M36671, [A. P. Monaco et al,
Nature, 323:646-650 (1986)] and L06900, [Roberts et al,
Hum. Mutat., 2:293-299 (1993)]] (Genbank), and a number
of genes which may be readily selected by one of skill in
the art. The selection of the transgene is not
considered to be a limitation of this invention, as such
selection is within the knowledge of the art-skilled.

C. Regulatory Elements

In addition to the major elements identified
above for the pAdΔ shuttle vector, i.e., the adenovirus
sequences and the transgene, the vector also includes
conventional regulatory elements necessary to drive
expression of the transgene in a cell transfected with
the pAdΔ vector. Thus the vector contains a selected
promoter which is linked to the transgene and located,
with the transgene, between the adenovirus sequences of
the vector.

Selection of the promoter is a routine matter
and is not a limitation of the pAdΔ vector itself.
Useful promoters may be constitutive promoters or
regulated (inducible) promoters, which will enable
control of the amount of the transgene to be expressed.
For example, a desirable promoter is that of the
cytomegalovirus immediate early promoter/enhancer [see,
e.g., Boshart et al, Cell, 41:521-530 (1985)]. This
promoter is found at nucleotides 5117-4524 of SEQ ID NO:
1 and nucleotides 969-1563 of SEQ ID NO: 2. Another
promoter is the CMV enhancer/chicken β-actin promoter
(nucleotides 9241-8684 of SEQ ID NO: 3). Another
desirable promoter includes, without limitation, the Rous
sarcoma virus LTR promoter/enhancer. Still other
promoter/enhancer sequences may be selected by one of
skill in the art.

The shuttle vectors will also desirably contain
nucleic acid sequences heterologous to the adenovirus
sequences including sequences providing signals required
for efficient polyadenylation of the transcript and
introns with functional splice donor and acceptor sites
(SD/SA). A common poly-A sequence which is employed in
the exemplary vectors of this invention is that derived
from the papovavirus SV-40 [see, e.g., nucleotides 837-639
of SEQ ID NO: 1; 5245-5443 of SEQ ID NO: 2; and 3887-3684
of SEQ ID NO: 3]. The poly-A sequence generally is
inserted in the vector following the transgene sequences
and before the 3' adenovirus sequences. A common intron
sequence is also derived from SV-40, and is referred to
as the SV-40 T intron sequence [see, e.g., nucleotides
4507-4376 of SEQ ID NO: 1 and 1579-1711 of SEQ ID NO: 2].
A pAdΔ shuttle vector of the present invention may also
contain such an intron, desirably located between the
promoter/enhancer sequence and the transgene. Selection
of these and other common vector elements is
conventional and many such sequences are available [see,
e.g., Sambrook et al, and references cited therein].
Examples of such regulatory sequences for the above are
provided in the plasmid sequences of Figs. 3, 5 and 7.

The combination of the transgene, promoter/enhancer,
and the other regulatory vector elements is
referred to as a "minigene" for ease of reference herein.
The minigene is preferably flanked by the 5' and 3'
cis-acting adenovirus sequences described above. Such a
minigene may have a size in the range of several hundred
base pairs up to about 30 kb due to the absence of
adenovirus early and late gene sequences in the vector.
Thus, this AdΔ vector system permits a great deal of
latitude in the selection of the various components of
the minigene, particularly the selected transgene, with
regard to size. Provided with the teachings of this
invention, the design of such a minigene can be made by
resort to conventional techniques.
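
The size latitude described above can be expressed as a simple check. This is a sketch only: the 30 kb ceiling is the approximate upper figure stated in the text, the function names are hypothetical, and the example component sizes are derived from the nucleotide ranges quoted for SEQ ID NO: 2 (short spacer sequences between components are ignored).

```python
# Illustrative sketch: does a proposed minigene fit within the approximate
# upper size given in the text? Names and the helper API are hypothetical.
MAX_MINIGENE_BP = 30_000   # "up to about 30 kb", treated here as a hard limit

# Component sizes computed from the ranges quoted for SEQ ID NO: 2:
# 969-1563, 1579-1711, 1762-5236 and 5245-5443 (inclusive lengths).
LACZ_MINIGENE = {
    "CMV promoter/enhancer": 595,
    "SD/SA intron": 133,
    "LacZ gene": 3475,
    "poly A sequence": 199,
}


def minigene_length(components: dict) -> int:
    """Total length in bp of the listed minigene components."""
    return sum(components.values())


def fits(components: dict, limit: int = MAX_MINIGENE_BP) -> bool:
    """True if the summed component length does not exceed the limit."""
    return minigene_length(components) <= limit


if __name__ == "__main__":
    print(minigene_length(LACZ_MINIGENE), fits(LACZ_MINIGENE))
```

The LacZ reporter minigene occupies only a small fraction of the available capacity, which illustrates why much larger transgenes can be considered for this vector.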

II. The Helper Virus

Because of the limited amount of adenovirus sequence
present in the AdΔ shuttle vector, a helper adenovirus of
this invention must, alone or in concert with a packaging
cell line, provide sufficient adenovirus gene sequences
necessary for a productive viral infection. Helper
viruses useful in this invention thus contain selected
adenovirus gene sequences, and optionally a second
reporter minigene.

Normally, the production of a recombinant adenovirus
which utilizes helper adenovirus containing a full
complement of adenoviral genes results in recombinant
virus contaminated by excess production of the helper
virus. Thus, extensive purification of the viral vector
from the contaminating helper virus is required.
However, the present invention provides a way to
facilitate purification and reduce contamination by
crippling the helper virus.
One preferred embodiment of a helper virus of this
invention thus contains three components: (A)
modifications or deletions of the native adenoviral gene
sequences which direct efficient packaging, so as to
substantially disable or "cripple" the packaging function
of the helper virus or its ability to replicate, (B)
selected adenovirus genes and (C) an optional reporter
minigene. These "crippled" helper viruses may also be
formed into poly-cation conjugates as described below.
The adenovirus sequences forming the helper virus
may be obtained from the sources identified above in the
discussion of the shuttle vector. Use of different Ad
serotypes as helper viruses enables production of
recombinant viruses containing the ΔAd (serotype 5)
shuttle vector sequences in a capsid formed by the other
serotype adenovirus. These recombinant viruses are
desirable in targeting different tissues, or evading an
immune response to the ΔAd sequences having a serotype 5
capsid. Use of these different Ad serotype helper
viruses may also demonstrate advantages in recombinant
virus production, stability and better packaging.

A. The Crippling Modifications

A desirable helper virus used in the production
of the adenovirus vector of this invention is modified
(or crippled) in its 5' ITR packaging/enhancer domain,
identified above. As stated above, the
packaging/enhancer region contains sequences necessary
for packaging linear adenovirus genomes ("PAC"
sequences). More specifically, this sequence contains at
least seven distinct yet functionally redundant domains
that are required for efficient encapsidation of
replicated viral DNA.

Within a stretch of nucleotide sequence from bp
194-358 of the Ad5 genome, five of these so-called
A-repeats or PAC sequences are localized (see, Fig. 1B).
PAC I is located at bp 241-248 of the adenovirus genome
(on the strand complementary to nucleotides 5259-5246 of
SEQ ID NO: 1). PAC II is located at bp 262-269 of the
adenovirus genome (on the strand complementary to
nucleotides 5238-5225 of SEQ ID NO: 1). PAC III is
located at bp 304-311 of the adenovirus genome (on the
strand complementary to nucleotides 5196-5183 of SEQ ID
NO: 1). PAC IV is located at bp 314-321 of the
adenovirus genome (on the strand complementary to nucleotides
5186-5172 of SEQ ID NO: 1). PAC V is located at bp 339-346
of the adenovirus genome (on the strand complementary to
nucleotides 5171-5147 of SEQ ID NO: 1).

Corresponding sequences can be obtained from
SEQ ID NO: 2 and 3. PAC I is located at nucleotides
837-851 of SEQ ID NO: 2; and on the strand complementary
to nucleotides 9374-9360 of SEQ ID NO: 3. PAC II is
located at nucleotides 859-863 of SEQ ID NO: 2; and on
the strand complementary to nucleotides 9353-9340 of SEQ
ID NO: 3. PAC III is located at nucleotides 901-916 of
SEQ ID NO: 2; and on the strand complementary to
nucleotides 9311-9298 of SEQ ID NO: 3. PAC IV is located
at nucleotides 911-924 of SEQ ID NO: 2; and on the strand
complementary to nucleotides 9301-9288 of SEQ ID NO: 3.
PAC V is located at nucleotides 936-949 of SEQ ID NO: 2;
and on the strand complementary to nucleotides 9276-9263
of SEQ ID NO: 3.

Table 1 below lists these five native Ad5
sequences and a consensus PAC sequence based on the
similarities within an eight nucleotide stretch shared by
the five sequences. The consensus sequence contains two
positions at which the nucleic acid may be A or T (A/T).
The conventional single letter designations are used for
the nucleic acids, as is known to the art.

Table 1

            Adenovirus Genome
A-Repeat    Base Pair Nos.      Nucleotide Sequence       SEQ ID NO

I           241-248             TAG TAAATTTG GGC          4
II          262-269             AGT AAGATTTG GCC          5
III         304-311             AGT GAAATCTG AAT          6
IV          314-321             GAA TAATTTTG TGT          7
V           339-346             CGT AATATTTG TCT          8

Consensus                       5' (A/T)AN(A/T)TTTG 3'    9
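
The relationship between the five A-repeats and the consensus of Table 1 can be checked mechanically. The following is a minimal sketch only: it assumes the eight-nucleotide cores are exactly as transcribed in Table 1, and the names `PAC_CORES`, `matches_consensus` and `mismatch_positions` are hypothetical helpers introduced for illustration, not part of the disclosure.

```python
import re

# Eight-nucleotide core of each A-repeat, as transcribed in Table 1
# (the middle group of each listed sequence). Treat these strings as
# an assumption: they are only as reliable as the transcription.
PAC_CORES = {
    "I": "TAAATTTG",
    "II": "AAGATTTG",
    "III": "GAAATCTG",
    "IV": "TAATTTTG",
    "V": "AATATTTG",
}

# Consensus 5' (A/T)AN(A/T)TTTG 3' expressed as a regular expression.
CONSENSUS = re.compile(r"[AT]A[ACGT][AT]TTTG")

# The same consensus as a per-position list of allowed bases.
CONSENSUS_TEMPLATE = ["AT", "A", "ACGT", "AT", "T", "T", "T", "G"]


def matches_consensus(core):
    """Return True if the eight-nucleotide core fits the consensus exactly."""
    return CONSENSUS.fullmatch(core) is not None


def mismatch_positions(core):
    """Return the 1-based positions at which the core departs from the consensus."""
    return [
        i + 1
        for i, (base, allowed) in enumerate(zip(core, CONSENSUS_TEMPLATE))
        if base not in allowed
    ]


if __name__ == "__main__":
    for name, core in PAC_CORES.items():
        print(name, core, matches_consensus(core), mismatch_positions(core))
```

Run on the sequences as transcribed here, the sketch reports which repeats fit the consensus exactly and where any repeat departs from it, which is one way to see that the consensus summarizes similarity rather than identity.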

According to this invention, mutations or
deletions may be made to one or more of these PAC
sequences to generate desirable crippled helper viruses.
A deletion analysis of the packaging domain revealed a
positive correlation between encapsidation efficiency and
the number of packaging A-repeats that were present at
the 5' end of the genome. Modifications of this domain
may include 5' adenovirus sequences which contain fewer
than all five of the PAC sequences of Table 1. For
example, only two PAC sequences may be present in the
crippled virus, e.g., PAC I and PAC II, PAC III and PAC
IV, and so on. Deletions of selected PAC sequences may
involve deletion of contiguous or non-contiguous
sequences. For example, PAC II and PAC IV may be
deleted, leaving PAC I, III and V in the 5' sequence.
Still an alternative modification may be the replacement
of one or more of the native PAC sequences with one or
more repeats of the consensus sequence of Table 1.
Alternatively, this adenovirus region may be modified by
deliberately inserted mutations which disrupt one or more
of the native PAC sequences. One of skill in the art may
further manipulate the PAC sequences to similarly achieve
the effect of reducing the helper virus packaging
efficiency to a desired level.

Exemplary helper viruses which involve the
manipulation of the PAC sequences described above are
disclosed in Example 7 below. Briefly, as described in
that example, one helper virus contains in place of the
native 5' ITR region (adenovirus genome bp 1-360), a 5'
adenovirus sequence spanning adenovirus genome bp 1-269,
which contains only the 5' ITR and PAC I and PAC II
sequences, and deletes the adenovirus region bp 270-360.

Another PAC sequence modified helper virus
contains only the 5' Ad5 sequence of the ITR and PAC I
through PAC IV (Ad bp 1-321), deleting PAC V and other
sequences in the Ad region bp 322-360.

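
Which A-repeats survive a given 5' truncation follows directly from the base pair coordinates listed in Table 1. The sketch below is illustrative only; `PAC_COORDS` and `retained_pacs` are hypothetical names, and the function assumes that a repeat is retained only when its entire eight base pair stretch lies at or before the last retained base pair.

```python
# Base pair coordinates of the five A-repeats, taken from Table 1.
PAC_COORDS = {
    "PAC I": (241, 248),
    "PAC II": (262, 269),
    "PAC III": (304, 311),
    "PAC IV": (314, 321),
    "PAC V": (339, 346),
}


def retained_pacs(last_bp):
    """List the A-repeats fully contained in adenovirus genome bp 1..last_bp."""
    return [name for name, (_, end) in PAC_COORDS.items() if end <= last_bp]


if __name__ == "__main__":
    # The two exemplary truncations described in the text, plus the
    # native 5' region (bp 1-360) for comparison.
    for last_bp in (269, 321, 360):
        print(last_bp, retained_pacs(last_bp))
```

Under these assumptions the bp 1-269 construct keeps PAC I and II, and the bp 1-321 construct keeps PAC I through IV, in agreement with the description above.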
These modified helper viruses are characterized
by reduced efficiency of helper virus encapsidation.
These helper viruses with the specific modifications of
the sequences related to packaging efficiency, provide a
packaging efficiency high enough for generating
production lots of the helper virus, yet low enough that
they permit the achievement of higher yields of AdA
transducing viral particles according to this invention.

B. The Selected Adenovirus Genes

Helper viruses useful in this invention,
whether or not they contain the "crippling" modifications
described above, contain selected adenovirus gene
sequences depending upon the cell line which is
transfected by the helper virus and shuttle vector. A
preferred helper virus contains a variety of adenovirus
genes in addition to the modified sequences described
above.

As one example, if the cell line employed to
produce the recombinant virus is not a packaging cell
line, the helper virus may be a wild type Ad virus.
Thus, the helper virus supplies the necessary adenovirus
early genes E1, E2, E4 and all remaining late,
intermediate, structural and non-structural genes of the
adenovirus genome. This helper virus may be a crippled
helper virus by incorporating modifications in its native
5' packaging/enhancer domain.

A desirable helper virus is replication
defective and lacks all or a sufficient portion of the
adenoviral immediate early gene E1a (which spans mu
1.3 to 4.5) and delayed early gene E1b (which spans mu
4.6 to 11.2) so as to eliminate their normal biological
functions. Such replication deficient viruses may also
have crippling modifications in the packaging/enhancer
domain. Because of the difficulty surrounding the
absolute removal of adenovirus from AdA preparations that
have been enriched by CsCl buoyant density
centrifugation, the use of a replication defective
adenovirus helper prevents the introduction of infectious
adenovirus for in vivo animal studies. This helper virus
is employed with a packaging cell line which supplies the
deficient E1 proteins, such as the 293 cell line.

Additionally, all or a portion of the
adenovirus delayed early gene E3 (which spans mu 76.6 to
86.2) may be eliminated from the adenovirus sequence
which forms a part of the helper viruses useful in this
invention, without adversely affecting the function of
the helper virus because this gene product is not
necessary for the formation of a functioning virus.

In the presence of other packaging cell lines
which are capable of supplying adenoviral proteins in
addition to the E1, the helper virus may accordingly be
deleted of the genes encoding these adenoviral proteins.
Such additionally deleted helper viruses also desirably
contain crippling modifications as described above.

C. A Reporter Minigene

It is also desirable for the helper virus to
contain a reporter minigene, in which the reporter gene
is desirably different from the reporter transgene
contained in the shuttle vector. A number of such
reporter genes are known, as referred to above. The
presence of a reporter gene on the helper virus which is
different from the reporter gene on the pAdA, allows both
the recombinant AdA virus and the helper virus to be
independently monitored. For example, the expression of
recombinant alkaline phosphatase enables residual
quantities of contaminating adenovirus to be monitored
independent of recombinant LacZ expressed by a pAdA
shuttle vector or an AdA virus.

D. Helper Virus Polycation Conjugates

Still another method for reducing the
contamination of helper virus involves the formation of
poly-cation helper virus conjugates, which may be
associated with a plasmid containing other adenoviral
genes, which are not present in the helper virus. The
helper viruses described above may be further modified by
resort to adenovirus-polylysine conjugate technology.
See, e.g., Wu et al, J. Biol. Chem., 264:16985-16987
(1989); and K. J. Fisher and J. M. Wilson, Biochem. J.,
299:49 (April 1, 1994), incorporated herein by
reference.

Using this technology, a helper virus
containing preferably the late adenoviral genes is
modified by the addition of a poly-cation sequence
distributed around the capsid of the helper virus.
Preferably, the poly-cation is poly-lysine, which
attaches around the negatively-charged vector to form an
external positive charge. A plasmid is then designed to
express those adenoviral genes not present in the helper
virus, e.g., the E1, E2 and/or E4 genes. The plasmid
associates to the helper virus-conjugate through the
charges on the poly-lysine sequence. This modification
is also desirably made to a crippled helper virus of this
invention. This conjugate (also termed a trans-infection
particle) permits additional adenovirus genes to be
removed from the helper virus and be present on a plasmid
which does not become incorporated into the virus during
production of the recombinant viral vector. Thus, the
impact of contamination is considerably lessened.

III. Assembly of Shuttle Vector, Helper Virus and
Production of Recombinant Virus

The material from which the sequences used in the
pAdA shuttle vector and the helper viruses are derived,
as well as the various vector components and sequences
employed in the construction of the shuttle vectors,
helper viruses, and AdA viruses of this invention, are
obtained from commercial or academic sources based on
previously published and described materials. These
materials may also be obtained from an individual patient
or generated and selected using standard recombinant
molecular cloning techniques known and practiced by those
skilled in the art. Any modification of existing nucleic
acid sequences forming the vectors and viruses, including
sequence deletions, insertions, and other mutations are
also generated using standard techniques.

Assembly of the selected DNA sequences of the
adenovirus, and the reporter genes or therapeutic genes
and other vector elements into the pAdA shuttle vector
using conventional techniques is described in Example 1
below. Such techniques include conventional cloning
techniques of cDNA such as those described in texts
[Sambrook et al, cited above], use of overlapping
oligonucleotide sequences of the adenovirus genomes,
polymerase chain reaction, and any suitable method which
provides the desired nucleotide sequence. Standard
transfection and co-transfection techniques are employed,
e.g., CaPO4 transfection techniques using the HEK 293
cell line. Other conventional methods employed in this
invention include homologous recombination of the viral
genomes, plaquing of viruses in agar overlay, methods of
measuring signal generation, and the like. Assembly of
any desired AdA vector or helper virus of this invention
is within the skill of the art, based on the teachings of
this invention.

As described in detail in Example 1 below and
with resort to Fig. 2A and the DNA sequence of the
plasmid reported in Fig. 3, a unique pAdA shuttle vector
of this invention, pAdA.CMVLacZ, is generated.
pAdA.CMVLacZ contains Ad5 sequences encoding the 5'
terminal followed by a CMV promoter/enhancer, a splice
donor/splice acceptor sequence, a bacterial
beta-galactosidase gene (LacZ), a SV-40 poly A sequence
(pA), a 3' ITR from Ad5 and remaining plasmid sequence
from plasmid pSP72 (Promega) backbone.

To generate the AdA genome which is
incorporated in the vector, the plasmid pAdA.CMVLacZ must
be digested with EcoRI to release the AdA.CMVLacZ
genome, freeing the adenovirus ITRs and making them
available targets for replication. Thus production of
the vector is "restriction-dependent", i.e., requires
restriction endonuclease rescue of the replication
template. See, Fig. 2B.

A second type of pAdA plasmid was designed
which places the 3' Ad terminal sequence in a
head-to-tail arrangement relative to the 5' terminal
sequence. As described in Example 1 and Fig. 4A, and
with resort to the DNA sequence of the plasmid reported
in Fig. 5, a second unique AdA vector sequence of this
invention, AdAc.CMVLacZ, is generated from the shuttle
plasmid pAdAc.CMVLacZ, which contains an Ad5 5' ITR
sequence and 3' ITR sequence positioned head-to-tail,
followed by a CMV enhancer/promoter, SD/SA sequence, LacZ
gene and pA sequence in a plasmid pSP72 (Promega)
backbone. As described in Example 1B, this
"restriction-independent" plasmid permits the AdA genome
to be replicated and rescued from the plasmid backbone
without including an endonuclease treatment (see,
Fig. 4B).

B. Helper Virus

As described in detail in Example 2, an
exemplary conventional E1 deleted adenovirus helper virus
is virus Ad.CBhpAP, which contains a 5' adenovirus
sequence from mu 0-1, a reporter minigene containing
human placenta alkaline phosphatase (hpAP) under the
transcriptional control of the chicken β-actin promoter,
followed by a poly-A sequence from SV40, followed by
adenovirus sequences from 9.2 to 78.4 and 86 to 100.
This helper contained deletions from mu 1.0 to 9.2 and
78.4 to 86, which eliminate substantially the E1 region
and the E3 region of the virus. This virus may be
desirably crippled according to this invention by
modifications to its packaging enhancer domain.
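
The deletions in Ad.CBhpAP can be recovered from the retained map unit segments stated above. The sketch below is a hedged illustration: `RETAINED_MU` and `deleted_intervals` are hypothetical names, the genome is treated as spanning 0 to 100 map units, and the hpAP minigene that occupies the E1 gap is deliberately ignored because only adenovirus sequence is being tallied.

```python
# Adenovirus segments retained in the Ad.CBhpAP helper, in map units,
# as stated in the text: mu 0-1, 9.2-78.4 and 86-100.
RETAINED_MU = [(0.0, 1.0), (9.2, 78.4), (86.0, 100.0)]


def deleted_intervals(retained, genome_end=100.0):
    """Return the map unit intervals not covered by the retained segments."""
    gaps = []
    cursor = 0.0
    for start, end in sorted(retained):
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < genome_end:
        gaps.append((cursor, genome_end))
    return gaps


if __name__ == "__main__":
    gaps = deleted_intervals(RETAINED_MU)
    total = sum(end - start for start, end in gaps)
    print(gaps)              # the E1 and E3 deletions
    print(round(total, 1))   # total map units of adenovirus sequence removed
```

The two gaps correspond to the E1 deletion (mu 1.0 to 9.2) and the E3 deletion (mu 78.4 to 86) named in the text.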

Exemplary crippled helper viruses of this
invention are described using the techniques described in
Example 7 and contain the modified 5' PAC sequences,
i.e., adenovirus genome bp 1-269; m.u. 0-0.75 or
adenovirus genome bp 1-321; m.u. 0-0.89. Briefly, the 5'
sequences are modified by PCR and cloned by conventional
techniques into a conventional adenovirus based plasmid.
A hpAP minigene is incorporated into the plasmid, which
is then altered by homologous recombination with an E3
deleted adenovirus dl7001 to result in the modified
vectors so that the reporter minigene is followed on its
3' end with the adenovirus sequences mu 9.6 to 78.3 and
87 to 100.
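
The base pair and map unit figures quoted for the crippled helpers are mutually consistent if one map unit is taken as roughly 360 bp. That conversion factor is an assumption inferred from this document's pairing of bp 1-360 with mu 0-1; it is not stated as such, and `BP_PER_MAP_UNIT` and `bp_to_mu` are hypothetical names used only for this sketch.

```python
# Assumed conversion factor: this document pairs adenovirus genome
# bp 1-360 with map units 0-1, so 1 map unit is taken as 360 bp.
BP_PER_MAP_UNIT = 360.0


def bp_to_mu(bp):
    """Convert an adenovirus genome base pair position to map units."""
    return bp / BP_PER_MAP_UNIT


if __name__ == "__main__":
    for bp in (269, 321):
        print(bp, round(bp_to_mu(bp), 2))
```

Rounded to two decimals, bp 269 and bp 321 give the m.u. 0.75 and 0.89 endpoints quoted above.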

Generation of a poly-L-lysine conjugate helper
virus was demonstrated essentially as described in detail
in Example 5 below and Fig. 10 by coupling poly-L-lysine
to the Ad.CBhpAP virion capsid. Alternatively, the same
procedure may be employed with the PAC sequence modified
helper viruses of this invention.

C. Recombinant AdA Virus

As stated above, a pAdA shuttle vector in the
presence of helper virus and/or a packaging cell line
permits the adenovirus-transgene sequences in the shuttle
vector to be replicated and packaged into virion capsids,
resulting in the recombinant AdA virus. The current
method for producing such AdA virus is transfection-based
and described in detail in Example 3. Briefly, helper
virus is used to infect cells, such as the packaging cell
line human HEK 293, which are then subsequently
transfected with a pAdA shuttle vector containing a
selected transgene by conventional methods. About 30 or
more hours post-transfection, the cells are harvested,
and an extract prepared. The AdA viral genome is
packaged into virions that sediment at a lower density
than the helper virus in cesium gradients. Thus, the
recombinant AdA virus containing a selected transgene is
separated from the bulk of the helper virus by
purification via buoyant density ultracentrifugation in a
CsCl gradient.

The yield of AdA transducing virus is largely
dependent on the number of cells that are transfected
with the pAdA shuttle plasmid, making it desirable to use
a transfection protocol with high efficiency. One such
method involves use of a poly-L-lysinylated helper
adenovirus as described above. A pAdA shuttle plasmid
containing the desired transgene under the control of a
suitable promoter, as described above, is then complexed
directly to the positively charged helper virus capsid,
resulting in the formation of a single transfection
particle containing the pAdA shuttle vector and the
helper functions of the helper virus.

The underlying principle is that the helper
adenovirus coated with plasmid pAdA DNA will co-transport
the attached nucleic acid across the cell membrane and
into the cytoplasm according to its normal mechanism of
cell entry. Therefore, the poly-L-lysine modified helper
adenovirus assumes multiple roles in the context of an
AdA-based complex. First, it is the structural
foundation upon which plasmid DNA can bind, increasing
the effective concentration. Second, receptor mediated
endocytosis of the virus provides the vehicle for cell
uptake of the plasmid DNA. Third, the endosomalytic
activity associated with adenoviral infection facilitates
the release of internalized plasmid into the cytoplasm.
Finally, the adenovirus contributes trans helper
functions on which the recombinant AdA virus is dependent
for replication and packaging of transducing viral
particles. The Ad-based transfection procedure using a
pAdA shuttle vector and a polycation-helper conjugate is
detailed in Example 6. Additionally, as described
previously, the helper virus-plasmid conjugate may be
another form of helper virus delivery of the omitted
adenovirus genes not present in the pAdA vector. Such a
structure enables the rest of the required adenovirus
genes to be divided between the plasmid and the helper
virus, thus reducing the self-replication efficiency of
the helper virus.

A presently preferred method of producing the
recombinant AdA virus of this invention involves
performing the above-described transfection with the
crippled helper virus or crippled helper virus conjugate,
as described above. A "crippled" helper virus of this
invention is unable to package itself efficiently, and
therefore permits ready separation of the helper virus
from the newly packaged AdA vector of this invention by
use of buoyant density ultracentrifugation in a CsCl
gradient, as described in the examples below.



IV. Function of the Recombinant AdA Virus 

Once the AdA virus of this invention is produced by
cooperation of the shuttle vector and helper virus, the
AdA virus can be targeted to, and taken up by, a selected
target cell. The selection of the target cell also
depends upon the use of the recombinant virus, i.e.,
whether or not the transgene is to be replicated in vitro
or ex vivo for production in a desired cell type for
redelivery into a patient, or in vivo for delivery to a
particular cell type or tissue. Target cells may be any
mammalian cell (preferably a human cell). For example,
in in vivo use, the recombinant virus can target to any
cell type normally infected by adenovirus, depending upon
the route of administration, i.e., it can target, without
limitation, neurons, hepatocytes, epithelial cells and
the like. The helper adenovirus sequences supply the
sequences necessary to permit uptake of the virus by the
AdA.

Once the recombinant virus is taken up by a cell,
the adenovirus flanked transgene is rescued from the
parental adenovirus backbone by the machinery of the
infected cell, as with other recombinant adenoviruses.
Once uncoupled (rescued) from the genome of the AdA
virus, the recombinant minigene seeks an integration site
in the host chromatin and becomes integrated therein,
either transiently or stably, providing expression of the
accompanying transgene in the host cell.

V. Use of the AdA Viruses in Gene Therapy 

The novel recombinant viruses and viral conjugates
of this invention provide efficient gene transfer
vehicles for somatic gene therapy. These viruses are
prepared to contain a therapeutic gene in place of the
LacZ reporter transgene illustrated in the exemplary
viruses and vectors. By use of the AdA viruses
containing therapeutic transgenes, these transgenes can
be delivered to a patient in vivo or ex vivo to provide
for integration of the desired gene into a target cell.
Thus, these viruses can be employed to correct genetic
deficiencies or defects. An example of the generation of
an AdA gene transfer vehicle for the treatment of cystic
fibrosis is described in Example 4 below. One of skill
in the art can generate any number of other gene transfer
vehicles by including a selected transgene for the
treatment of other disorders.

The recombinant viruses of the present invention may
be administered to a patient, preferably suspended in a
biologically compatible solution or pharmaceutically
acceptable delivery vehicle. A suitable vehicle includes
sterile saline. Other aqueous and non-aqueous isotonic
sterile injection solutions and aqueous and non-aqueous
sterile suspensions known to be pharmaceutically
acceptable carriers and well known to those of skill in
the art may be employed for this purpose.

The recombinant viruses of this invention may be
administered in sufficient amounts to transfect the
desired cells and provide sufficient levels of
integration and expression of the selected transgene to
provide a therapeutic benefit without undue adverse
effects or with medically acceptable physiological
effects which can be determined by those skilled in the
medical arts. Conventional and pharmaceutically
acceptable parenteral routes of administration include
direct delivery to the target organ, tissue or site,
intranasal, intravenous, intramuscular, subcutaneous,
intradermal and oral administration. Routes of
administration may be combined, if desired.

Dosages of the recombinant virus will depend
primarily on factors such as the condition being treated,
the selected gene, the age, weight and health of the
patient, and may thus vary among patients. A
therapeutically effective human dosage of the viruses of
the present invention is believed to be in the range of
from about 20 to about 50 ml of saline solution
containing concentrations of from about 1 x 10^7 to
1 x 10^10 pfu/ml virus of the present invention. A
preferred human dosage is about 20 ml saline solution at
the above concentrations. The dosage will be adjusted to
balance the therapeutic benefit against any side effects.
The levels of expression of the selected gene can be
monitored to determine the selection, adjustment or
frequency of dosage administration.

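
The volume and concentration ranges above imply a range of total virus per administration. The following is a simple arithmetic sketch, not a dosing recommendation; `total_pfu` is a hypothetical helper, and the inputs are just the endpoints quoted in the preceding paragraph.

```python
def total_pfu(volume_ml, titer_pfu_per_ml):
    """Total plaque forming units in a given volume at a given titer."""
    return volume_ml * titer_pfu_per_ml


if __name__ == "__main__":
    # Endpoints quoted in the text: 20 to 50 ml at 1e7 to 1e10 pfu/ml.
    lowest = total_pfu(20, 1e7)
    highest = total_pfu(50, 1e10)
    print(f"{lowest:.0e} to {highest:.0e} pfu")

    # Preferred volume of about 20 ml across the same titer range.
    print(f"{total_pfu(20, 1e7):.0e} to {total_pfu(20, 1e10):.0e} pfu")
```

The spread across more than three orders of magnitude reflects the width of the stated titer range rather than any precision in the calculation.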
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The following examples illustrate the construction 
of the pAdA shuttle vectors, helper viruses and
recombinant AdA viruses of the present invention and the
use thereof in gene therapy. These examples are
illustrative only, and do not limit the scope of the
present invention.

Example 1 - Production of pAdA.CMVLacZ and pAdAc.CMVLacZ
Shuttle Vectors

A. pAdA.CMVLacZ

A human adenovirus Ad5 sequence was modified to
contain a deletion in the E1a region [map units 1 to
9.2], which immediately follows the Ad 5' region (bp
1-360) (illustrated in Fig. 1A). Thus, the plasmid
contains the 5' ITR sequence (bp 1-103), the native
packaging/enhancer sequences and the TATA box for the E1a
region (bp 104-360). A minigene containing the CMV
immediate early enhancer/promoter, an SD/SA sequence, a
cytoplasmic lacZ gene, and SV40 poly A (pA), was
introduced at the site of the E1a deletion. This
construct was further modified so that the minigene is
followed by the 3' ITR sequences (bp 35,353-end). The
DNA sequences for these components are provided in Fig. 3
and SEQ ID NO: 1 (see, also the brief description of this
figure).

This construct was then cloned by conventional
techniques into a pSP72 vector (Promega) backbone to make
the circular shuttle vector pAdA.CMVLacZ. See the
schematic of Fig. 2A. This construct was engineered with
EcoRI sites flanking the 5' and 3' Ad5 ITR sequences.
pAdA.CMVLacZ was then subjected to enzymatic digestion
with EcoRI, releasing a linear fragment of the vector
spanning the terminal end of the Ad 5' ITR sequence
through the terminal end of the 3' ITR sequence from the
plasmid backbone. See Fig. 2B.

B. pAdAc.CMVLacZ

The shuttle vector pAdAc.CMVLacZ (Figs. 4A and
5) was constructed using a pSP72 (Promega) backbone so
that the Ad5 5' ITR and 3' ITR were positioned
head-to-tail. The organization of the Ad5 ITRs was based
on reports that suggest circular Ad genomes that have the
terminal ends fused together head-to-tail are infectious
to levels comparable to linear Ad genomes. A minigene
encoding the CMV enhancer, an SD/SA sequence, the LacZ
gene, and the poly A sequence was inserted immediately
following the 5' ITR. The DNA sequence of the resulting
plasmid and the sequences for the individual components
are reported in Fig. 5 and SEQ ID NO: 2 (see also, brief
description of Fig. 5). This plasmid does not require
enzymatic digestion prior to its use to produce the viral
particle (see Example 3). This vector was designed to
enable restriction-independent production of LacZ AdA
vectors.

Example 2 - Construction of a Helper Virus

The Ad.CBhpAP helper virus [K. Kozarsky et al, Som.
Cell Mol. Genet., 19(5):449-458 (1993)] is a replication
deficient adenovirus containing an alkaline phosphatase
minigene. Its construction involved conventional cloning
and homologous recombination techniques. The adenovirus
DNA substrate was extracted from CsCl purified dl7001
virions, an Ad5 (serotype subgroup C) variant that
carries a 3 kb deletion between mu 78.4 through 86 in the
nonessential E3 region (provided by Dr. William Wold,
Washington University, St. Louis, Missouri). Viral DNA
was prepared for co-transfection by digestion with ClaI
(adenovirus genomic bp position 917) which removes the
left arm of the genome encompassing adenovirus map units
0-2.5. See lower diagram of Fig. 1B.

WO 96/13597                                PCT/US95/14017

A parental cloning vector, pAd.BglII was designed.
It contains two segments of wild-type Ad5 genome (i.e.,
map units 0-1 and 9-16.1) separated by a unique BglII
cloning site for insertion of heterologous sequences.
The missing Ad5 sequences between the two domains
(adenovirus genome bp 361-3327) results in the deletion
of E1a and the majority of E1b following recombination
with viral DNA.

A recombinant hpAP minigene was designed and
inserted into the BglII site of pAd.BglII to generate the
complementing plasmid, pAdCBhpAP. The linear arrangement
of this minigene includes:

(a) the chicken cytoplasmic β-actin promoter
[nucleotides +1 to +275 as described in T. A. Kost et al,
Nucl. Acids Res., 11(23):8287 (1983); nucleotides
9241-8684 of Fig. 7];

(b) an SV40 intron (e.g., nucleotides 1579-1711 of
SEQ ID NO: 2),

(c) the sequence for human placental alkaline
phosphatase (available from Genbank) and

(d) an SV40 polyadenylation signal (a 237 Bam
HI-BclI restriction fragment containing the
cleavage/poly-A signals from both the early and late
transcription units; e.g., nucleotides 837-639 of SEQ ID
NO: 1).

The resulting complementing plasmid, pAdCBhpAP
contained a single copy of recombinant hpAP minigene
flanked by adenovirus coordinates 0-1 on one side and
9.2-16.1 on the other.

Plasmid DNA was linearized using a unique NheI site
immediately 5' to adenovirus map unit zero (0) and the
above-identified adenovirus substrate and the
complementing plasmid DNAs were transfected to 293 cells
[ATCC CRL1573] using a standard calcium phosphate
transfection procedure [see, e.g., Sambrook et al, cited
above]. The end result of homologous recombination
involving sequences that map to adenovirus map units
9-16.1 is hybrid Ad.CBhpAP helper virus which contains
adenovirus map units 0-1 and, in place of the E1a and E1b
coding regions from the dl7001 adenovirus substrate, is
the hpAP minigene from the plasmid, followed by Ad
sequences 9 to 100, with a deletion in the E3 (78.4-86
mu) regions.

Example 3 - Production of Recombinant AdA Virus

The recombinant AdA viruses of this invention are
generated by co-transfection of a shuttle vector with the
helper virus in a selected packaging or non-packaging
cell line.

As described in detail below, the linear fragment
provided in Example 1A, or the circular AdA genome
carrying the LacZ of Example 1B, is packaged into the
Ad.CBhpAP helper virus (Example 2) using conventional
techniques, which provides an empty capsid head, as
illustrated in Fig. 2C. Those virus particles which have
successfully taken up the pAd shuttle genome into the
capsid head can be distinguished from those containing
the hpAP gene by virtue of the differential expression of
LacZ and hpAP.

In more detail, 293 cells (4 x 10^7 pfu 293 cells/150
mm dish) were seeded and infected with helper virus
Ad.CBhpAP (produced as described in Example 2) at an MOI
of 5 in 20 ml DMEM/2% fetal bovine serum (FBS). This
helper specific marker is critical for monitoring the
level of helper virus contamination in AdA preparations
before and after purification. The helper virus provides
in trans the necessary helper functions for synthesis and
packaging of the AdA.CMVLacZ genome.
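
The amount of helper virus needed per dish follows from the multiplicity of infection. The sketch below assumes the "4 x 10^7" figure denotes the number of 293 cells per 150 mm dish, which is an interpretation of the text rather than something it states unambiguously; `infectious_units_needed` is a hypothetical helper.

```python
def infectious_units_needed(cell_count, moi):
    """Infectious units (pfu) required to infect cell_count cells at a given MOI."""
    return cell_count * moi


if __name__ == "__main__":
    cells_per_dish = 4e7   # assumed reading of the figure in the text
    moi = 5                # multiplicity of infection stated in the text
    volume_ml = 20         # infection volume stated in the text

    pfu = infectious_units_needed(cells_per_dish, moi)
    print(f"{pfu:.0e} pfu per dish")
    print(f"{pfu / volume_ml:.0e} pfu/ml in the infection medium")
```

Under that reading, each dish receives on the order of 2 x 10^8 pfu of Ad.CBhpAP.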

Two hours post infection, using either the
restriction-dependent shuttle vector or the
restriction-independent shuttle vector, plasmid
pAdA.CMVLacZ (digested with EcoRI) or pAdAc.CMVLacZ DNA,
each carrying a LacZ minigene, was added to the cells by
a calcium phosphate precipitate (2.5 ml calcium phosphate
transfection cocktail containing 50 µg plasmid DNA).
Thirty to forty hours post-transfection, cells were
harvested, suspended in 10 mM Tris-Cl (pH 8.0) (0.5
ml/150 mm plate) and frozen at -80°C. Frozen cell
suspensions were subjected to three rounds of freeze
(ethanol-dry ice)-thaw (37°C) cycles to release virion
capsids. Cell debris was removed by centrifugation
(5,000 x g for 10 minutes) and the clarified supernatant
applied to a CsCl gradient to separate recombinant virus
from helper virus as follows.

Supernatants (10 ml) applied to the discontinuous
CsCl gradient (composed of equal volumes of CsCl at 1.2
g/ml, 1.36 g/ml, and 1.45 g/ml 10 mM Tris-Cl (pH 8.0))
were centrifuged for 8 hours at 72,128 x g, resulting in
separation of infectious helper virus from incompletely
formed virions. Fractions were collected from the
interfacing zone between the helper and top components
and analyzed by Southern blot hybridization or for the
presence of LacZ transducing particles. For functional
analysis, aliquots (2.0 ml from each sample) from the
same fractions were added to monolayers of 293 cells (in
35 mm wells) and expression of recombinant
β-galactosidase determined 24 hours later. More
specifically, monolayers were harvested, suspended in 0.3
ml 10 mM Tris-Cl (pH 8.0) buffer and an extract prepared
by three rounds of freeze-thaw cycles. Cell debris was
removed by centrifugation and the supernatant tested for
β-galactosidase (LacZ) activity according to the
procedure described in J. Price et al, Proc. Natl. Acad.
Sci. USA, 84:156-160 (1987). The specific activity
(milliunits β-galactosidase/mg protein) of reporter
enzymes was measured from indicator cells. For the
recombinant virus, specific activity was 116.

Fractions with β-galactosidase activity from the
discontinuous gradient were sedimented through an
equilibrium cesium gradient to further enrich the
preparation for AdA virus. A linear gradient was
generated in the area of the recombinant virus spanning
densities 1.29 to 1.34 gm/ml. A sharp peak of the
recombinant virus, detected as the appearance of
β-gal activity in infected 293 cells, eluted between 1.31
and 1.33 gm/ml. This peak of recombinant virus was
located between two major A260 nm absorbing peaks and in
an area of the gradient where the helper virus was
precipitously dropping off. The equilibrium
sedimentation gradient accomplished another 10^2 to 10^3
fold purification of recombinant virus from helper virus.
The yield of recombinant AdA.CMVLacZ virus recovered from
a 50 plate prep after 2 sedimentations ranged from 10^7 to
10^8 transducing particles.
Analysis of lysates of cells transfected with the
recombinant vector and infected with helper revealed
virions capable of transducing the recombinant minigene
contained within the vector. Subjecting aliquots of the
fractions to Southern analysis using probes specific to
the recombinant virus or helper virus revealed packaging
of multiple molecular forms of vector derived sequence.
The predominant form of the deleted viral genome was the
size (~5.5 kb) of the corresponding double stranded DNA
monomer (AdA.CMVLacZ) with less abundant but discrete
higher molecular weight species (~10 kb and ~15 kb) also
present. Full-length helper virus is 35 kb. Importantly,
the peak of vector transduction activity corresponds with
the highest molecular weight form of the deleted virus.
These results confirm the hypothesis that ITRs and
contiguous packaging sequence are the only elements
necessary for incorporation into virions. An apparently
ordered or preferred rearrangement of the recombinant Ad
monomer genome leads to a more biologically active
molecule. The fact that larger molecular species of the
deleted genome are 2x and 3x larger than the monomer
deleted virus genome suggests that the rearrangements may
involve sequential duplication of the original genome.
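The size argument above can be checked with simple arithmetic. The following is a minimal sketch, assuming only the approximate 5.5 kb monomer size given in the text; the function name is illustrative and the band sizes read from a Southern blot are themselves estimates.

```python
MONOMER_KB = 5.5  # approximate monomer size reported in the text


def multimer_sizes(monomer_kb, max_copies=3):
    """Return {copy number: expected size in kb} for whole-number multiples."""
    return {n: monomer_kb * n for n in range(1, max_copies + 1)}


sizes = multimer_sizes(MONOMER_KB)
for copies, kb in sizes.items():
    print(f"{copies}x genome: about {kb:.1f} kb")
```

Under this assumption the 2x and 3x forms come out near 11 kb and 16.5 kb, which is in the same range as the ~10 kb and ~15 kb species reported above.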

These same procedures may be adapted for production
of a recombinant AdA virus using a crippled helper virus
or helper virus conjugate as described previously.

Example 4 - Recombinant AdA Virus Containing a
Therapeutic Minigene

To test the versatility of the recombinant AdA virus
system, the reporter LacZ minigene obtained from
pAdACMVLacZ was cassette replaced with a therapeutic
minigene encoding CFTR.

The minigene contained human CFTR cDNA [Riordan et
al, Science, 245:1066-1073 (1989); nucleotides 8622-4065
of SEQ ID NO: 3] under the transcriptional control of a
chimeric CMV enhancer/chicken β-actin promoter element
(nucleotides +1 to +275 as described in T. A. Kost et al,
Nucl. Acids Res., 11(23):8287 (1983); nucleotides
9241-8684 of SEQ ID NO: 3, Fig. 7); and followed by an SV-40
poly-A sequence (nucleotides 3887-3684 of SEQ ID NO: 3,
Fig. 7).

The CFTR minigene was inserted into the E1 deletion
site of an Ad5 virus (called pAd.E1A) which contains a
deletion in E1a from mu 1-9.2 and a deletion in E3 from
mu 78.4-86.

The resulting shuttle vector called pAdA.CBCFTR (see
Figs. 6 and the DNA sequence of Fig. 7 [SEQ ID NO: 3])
used the same Ad ITRs of pAdACMVLacZ, but the Ad5
sequences terminated with NheI sites instead of EcoRI.
Therefore release of the minigene from the plasmid was
accomplished by digestion with NheI.

The vector production system described in Example 3
was employed, using the helper virus Ad.CBhpAP (Example
2). Monolayers of 293 cells grown to 80-90% confluency
in 150 mm culture dishes were infected with the helper
virus at an MOI of 5. Infections were done in DMEM
supplemented with 2% FBS at 20 ml media/150 mm plate.
Two hours post-infection, 50 µg plasmid DNA in 2.5 ml
transfection cocktail was added to each plate and evenly
distributed.
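The infection and transfection inputs above reduce to two small calculations. The following is a minimal sketch, assuming the figure of 4 x 10^7 cells per 150 mm dish quoted in Example 7 also applies here (the cell count is not stated in this example) and reading the plasmid mass as 50 µg; the function names are illustrative.

```python
def helper_pfu_per_plate(cells_per_plate, moi):
    """Infectious units of helper virus needed for a given MOI."""
    return cells_per_plate * moi


def dna_concentration_ug_per_ml(dna_ug, cocktail_ml):
    """Plasmid concentration in the transfection cocktail."""
    return dna_ug / cocktail_ml


CELLS_PER_150MM_PLATE = 4e7  # assumption: value quoted for a 150 mm dish in Example 7
MOI = 5                      # from the text

pfu = helper_pfu_per_plate(CELLS_PER_150MM_PLATE, MOI)
dna_conc = dna_concentration_ug_per_ml(50.0, 2.5)

print(f"helper virus per plate: {pfu:.1e} pfu")
print(f"plasmid in cocktail: {dna_conc:.0f} ug/ml")
```

With these inputs each plate would receive on the order of 2 x 10^8 pfu of helper virus, and the cocktail would carry the plasmid at 20 µg/ml.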

Delivery of the pAdA.CBCFTR plasmid to 293 cells was
mediated by formation of a calcium phosphate precipitate
and AdA.CBCFTR virus resolved from Ad.CBhpAP helper virus
by CsCl buoyant density ultracentrifugation as follows:
Cells were left in this condition for 10-14 h,
after which the infection/transfection media was replaced
with 20 ml fresh DMEM/2% FBS. Approximately 30 h
post-transfection, cells were harvested, suspended in 10 mM
Tris-Cl (pH 8.0) buffer (0.5 ml/150 mm plate), and stored
at -80°C.

Frozen cell suspensions were lysed by three
sequential rounds of freeze (ethanol-dry ice)-thaw
(37°C). Cell debris was removed by centrifugation (5,000
x g for 10 min) and 10 ml clarified extract layered onto
a CsCl step gradient composed of three 9.0 ml tiers with
densities 1.45 g/ml, 1.36 g/ml, and 1.20 g/ml CsCl in 10
mM Tris-Cl (pH 8.0) buffer. Centrifugation was performed
at 20,000 rpm in a Beckman SW-28 rotor for 8 h at 4°C.
Fractions (1.0 ml) were collected from the bottom of the
centrifuge tube and analyzed for rAAd transducing
vectors. Peak fractions were combined and banded to
equilibrium. Fractions containing transducing virions
were dialyzed against 20 mM HEPES (pH 7.8)/150 mM NaCl
(HBS) and stored frozen at -80°C in the presence of 10%
glycerol or as a liquid stock at -20°C (HBS+40%
glycerol).
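The rotor speed given above can be related to a relative centrifugal force with the standard conversion RCF = 1.118 x 10^-5 x r x rpm^2 (r in cm). The following is a minimal sketch; the radius of 16.1 cm is an assumed maximum radius for an SW-28 type rotor and should be checked against the rotor manual before use.

```python
def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (x g) from rotor speed and radius in cm."""
    return 1.118e-5 * radius_cm * rpm ** 2


SW28_RMAX_CM = 16.1  # assumption: maximum radius, not stated in the text

rcf = rcf_from_rpm(20_000, SW28_RMAX_CM)
print(f"20,000 rpm at r = {SW28_RMAX_CM} cm is about {rcf:,.0f} x g")
```

Under this assumed radius the result is close to 72,000 x g, in line with the 72,128 x g quoted for the comparable 8 hour spin in Example 3.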

Fractions collected after ultracentrifugation were
analyzed for transgene expression and vector DNA. For
lacZ ArAd vectors, 2 µl aliquots were added to 293 cell
monolayers seeded in 35 mm culture wells. Twenty-four
hours later cells were harvested, suspended in 0.3 ml 10
mM Tris-Cl (pH 8.0) buffer, and lysed by three rounds of
freeze-thaw. Cell debris was removed by centrifugation
(15,000 x g for 10 min) and the supernatant assayed for
total protein [Bradford, (1976)] and β-galactosidase
activity [Sambrook et al, (1989)] using ONPG
(o-Nitrophenyl β-D-galactopyranoside) as substrate.

Expression of CFTR protein from the AdA.CBCFTR
vector was determined by immunofluorescence localization.
Aliquots of AdA.CBCFTR, enriched by two rounds of
ultracentrifugation and exchanged to HBS storage buffer,
were added to primary cultures of airway epithelial cells
obtained from the lungs of CF transplant recipients.
Twenty-four hours after the addition of vector, cells
were harvested and affixed to glass slides using
centrifugal force (Cytospin 3, Shandon Scientific
Limited). Cells were fixed with freshly prepared 3%
paraformaldehyde in PBS (1.4 mM KH2PO4, 4.3 mM Na2HPO4,
2.7 mM KCl, and 137 mM NaCl) for 15 min at room
temperature (RT), washed twice in PBS, and permeabilized
with 0.05% NP-40 for 10 min at RT. The
immunofluorescence procedure began with a blocking step
in 10% goat serum (PBS/GS) for 1 h at RT, followed by
binding of the primary monoclonal mouse anti-human CFTR
(R-domain specific) antibody (Genzyme) diluted 1:500 in
PBS/GS for 2 h at RT. Cells were washed extensively in
PBS/GS and incubated for 1 h at RT with a donkey
anti-mouse IgG (H+L) FITC conjugated
antibody (Jackson ImmunoResearch Laboratories) diluted
1:100 in PBS/GS.

For Southern analysis of vector DNA, 5 µl aliquots
were taken directly from CsCl fractions and incubated
with 20 µl capsid digestion buffer (50 mM Tris-Cl, pH
8.0; 1.0 mM EDTA, pH 8.0; 0.5% SDS, and 1.0 mg/ml
Proteinase K) at 50°C for 1 h. The reactions were
allowed to cool to RT, loading dye was added, and the
samples were electrophoresed through a 1.2% agarose gel.
Resolved DNAs were electroblotted onto a nylon membrane
(Hybond-N) and hybridized with a 32P-labeled restriction
fragment. Blots were analyzed by autoradiography or
scanned on a PhosphorImager 445 SI (Molecular Dynamics).

The results that were obtained from Southern blot
analysis of gradient fractions revealed a distinct viral
band that migrated faster than the helper Ad.CBhpAP DNA.
The highest viral titers mapped to fractions 3 and 4.
Quantitation of the bands in fraction 4 indicated the
titer of Ad.CBhpAP was approximately 1.5x greater than
AdA.CBCFTR. However, if the size difference between the
two viruses is factored in (Ad.CBhpAP=35 kb;
AdA.CBCFTR=6.2 kb), the viral titer (where 1 particle=1
DNA molecule) of AdA.CBCFTR is at least 4-fold greater
than the viral titer of Ad.CBhpAP.
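The conversion from band intensity to particle number can be written out explicitly. The following is a minimal sketch, assuming (as the reasoning above implies) that the quantitated band signal scales with DNA mass and that each particle carries one genome; the function and variable names are illustrative.

```python
HELPER_KB = 35.0          # Ad.CBhpAP genome size from the text
VECTOR_KB = 6.2           # AdA.CBCFTR genome size from the text
HELPER_SIGNAL_FOLD = 1.5  # helper band signal relative to the vector band


def vector_to_helper_particle_ratio(vector_kb, helper_kb, helper_signal_fold):
    """Particle ratio, assuming signal is proportional to DNA mass and
    one DNA molecule corresponds to one particle."""
    vector_molecules = 1.0 / vector_kb               # relative molecules per unit signal
    helper_molecules = helper_signal_fold / helper_kb
    return vector_molecules / helper_molecules


ratio = vector_to_helper_particle_ratio(VECTOR_KB, HELPER_KB, HELPER_SIGNAL_FOLD)
print(f"vector:helper particle ratio is about {ratio:.1f}")
```

With the rounded inputs quoted in the text the ratio comes out near 3.8, that is, roughly fourfold in favor of the AdA.CBCFTR particles.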
While Southern blot analysis of gradient fractions
was useful for showing the production of AdA viral
particles, it also demonstrated the utility of
ultracentrifugation for purifying AdA viruses.
Considering the latter of these, both LacZ and CFTR
transducing viruses banded in CsCl to an intermediate
density between infectious adenovirus helper virions
(1.34 g/ml) and incompletely formed capsids (1.31 g/ml).
The lighter density relative to helper virus likely
results from the smaller genome carried by the AdA
viruses. This further suggests changes in virus size
influence the density and purification of AdA virus.
Regardless, the ability to separate AdA virus from the
helper virus is an important observation and suggests
further purification may be achieved by successive rounds
of banding through CsCl.

This recombinant virus is useful in gene therapy 
alone, or preferably, in the form of a conjugate prepared 
as described herein. 

Example 5 - Correction of Genetic Defect in CF Airway
Epithelial Cells with AdA.CBCFTR

Treatment of cystic fibrosis, utilizing the
recombinant virus provided above, is particularly suited
for in vivo, lung-directed, gene therapy. Airway
epithelial cells are the most desirable targets for gene
transfer because the pulmonary complications of CF are
usually its most morbid and life-limiting.

The recombinant AdA.CBCFTR virus was fractionated on
sequential CsCl gradients and fractions containing CFTR
sequences, migrating between the adenovirus and top
components fractions described above, were used to infect
primary cultures of human airway epithelial cells derived
from the lungs of a CF patient. The cultures were
subsequently analyzed for expression of CFTR protein by
immunocytochemistry. Immunofluorescent detection with
mouse anti-human CFTR (R domain specific) antibody was
performed 24 hours after the addition of the recombinant
virus. Analysis of mock infected CF cells failed to
reveal significant binding to the R domain specific CFTR
antibody. Primary airway epithelium cultures exposed to
the recombinant virus demonstrated high levels of CFTR
protein in 10-20% of the cells.

Thus, the recombinant virus of the invention,
containing the CFTR gene, may be delivered directly into
the airway, e.g. by formulating the virus above into a
preparation which can be inhaled. For example, the
recombinant virus or conjugate of the invention
containing the CFTR gene is suspended in 0.25 molar
sodium chloride. The virus or conjugate is taken up by
respiratory airway cells and the gene is expressed.

Alternatively, the virus or conjugates of the
invention may be delivered by other suitable means,
including site-directed injection of the virus bearing
the CFTR gene. In the case of CFTR gene delivery,
preferred solutions for bronchial instillation are
sterile saline solutions containing in the range of from
about 1 x 10^7 to 1 x 10^10 pfu/ml, more particularly, in
the range of from about 1 x 10^8 to 1 x 10^9 pfu/ml of the
virus of the present invention.

Other suitable methods for the treatment of cystic
fibrosis by use of gene therapy recombinant viruses of
this invention may be obtained from the art discussions
of other types of gene therapy vectors for CF. See, for
example, U.S. Patent No. 5,240,846, incorporated by

Example 6 - Synthesis of Polycation Helper Virus
Conjugate

Another version of the helper virus of this
invention is a polylysine conjugate which enables the
pAdA shuttle plasmid to complex directly with the helper
virus capsid. This conjugate permits efficient delivery
of the pAdA shuttle plasmid in tandem with the
helper virus, thereby removing the need for a separate
transfection step. See Fig. 10 for a diagrammatic
outline of this construction. Alternatively, such a
conjugate with a plasmid supplying some Ad genes and the
helper supplying the remaining necessary genes for
production of the AdA viral vector provides a novel way
to reduce contamination by the helper virus, as discussed
above.

Purified stocks of a large-scale expansion of
Ad.CBhpAP were modified by coupling poly-L-lysine to the
virion capsid essentially as described by K. J. Fisher
and J. M. Wilson, Biochem. J., 299:49-58 (1994),
resulting in an Ad.CBhpAP-(Lys)n conjugate. The
procedure involves three steps.

First, CsCl band purified helper virus Ad.CBhpAP was
reacted with the heterobifunctional crosslinker
sulfo-SMCC [sulfo-(N-succinimidyl-4-(N-maleimidomethyl)
cyclohexane-1-carboxylate)] (Pierce). The conjugation
reaction, which contained 0.5 mg (375 nmol) of
sulfo-SMCC and 6 x 10^12 A260 helper virus particles in 3.0 ml of
HBS, was incubated at 30°C for 45 minutes with constant
gentle shaking. This step involved formation of an
amide bond between the active N-hydroxysuccinimide
(NHS) ester of sulfo-SMCC and a free amine (e.g. lysine)
contributed by an adenovirus protein sequence (capsid
protein) in the vector, yielding a maleimide-activated
viral particle. The activated adenovirus is shown in
Fig. 10 having the capsid protein fiber labeled with the
thiol-reactive maleimide moiety. In practice, other capsid
polypeptides including hexon and penton base are also
targeted.

Unincorporated, unreacted cross-linker was removed
by gel filtration on a 1 cm x 15 cm Bio-Gel P-6DG
(Bio-Rad Laboratories) column equilibrated with 50 mM Tris/HCl
buffer, pH 7.0, and 150 mM NaCl. Peak A260 fractions
containing maleimide-activated helper virus were combined
and placed on ice.

Second, poly-L-lysine having a molecular mass of 58
kDa at 10 mg/ml in 50 mM triethanolamine buffer (pH 8.0),
150 mM NaCl and 1 mM EDTA was thiolated with
2-iminothiolane/HCl (Traut's Reagent; Pierce) to a molar
ratio of 2 moles -SH/mole polylysine under N2; the cyclic
thioimidate reacts with the poly(L-lysine) primary amines
resulting in a thiolated polycation. After a 45 minute
incubation at room temperature the reaction was applied
to a 1 cm x 15 cm Bio-Gel P6DG column equilibrated with
50 mM Tris/HCl buffer (pH 7.0), 150 mM NaCl and 2 mM EDTA
to remove unincorporated Traut's Reagent.

Quantification of free thiol groups was accomplished
with Ellman's reagent [5,5'-dithio-bis-(2-nitrobenzoic
acid)], revealing approximately 3-4 mol of -SH/mol of
poly(L-lysine). The coupling reaction was initiated by
adding 1 x 10^12 A260 particles of maleimide-activated
helper virus/mg of thiolated poly(L-lysine) and
incubating the mixture on ice at 4°C for 15 hours under
argon. 2-mercaptoethylamine was added at the completion
of the reaction and incubation carried out at room
temperature for 20 minutes to block unreacted maleimide
sites.

Virus-polylysine conjugates, Ad.CPAP-p(Lys)n, were
purified away from unconjugated poly(L-lysine) by
ultracentrifugation through a CsCl step gradient with an
initial composition of equal volumes of 1.45 g/ml (bottom
step) and 1.2 g/ml (top step) CsCl in 10 mM Tris/HCl
buffer (pH 8.0). Centrifugation was at 90,000 g for 2
hours at 5°C. The final product was dialyzed against 20
mM Hepes buffer (pH 7.8) containing 150 mM NaCl (HBS).

Example 7 - Formation of AdA/helper-pLys Viral Particle

The formation of Ad.CBhpAP-pLys/pAdA.CMVLacZ
particle is initiated by adding 20 µg plasmid
pAdA.CMVLacZ DNAs to 1.2 x 10^12 A260 particles
Ad.CBhpAP-pLys in a final volume of 0.2 ml DMEM and allowing the
complex to develop at room temperature for between 10-15
minutes. This ratio typically represents the plasmid DNA
binding capacity of a standard lot of adenovirus-pLys
conjugate and gives the highest levels of plasmid
transgene expression.

The resulting trans-infection particle is
transfected onto 293 cells (4 x 10^7 cells seeded on a 150
mm dish). Thirty hours after transfection, the particles
are recovered and subjected to a freeze/thaw technique to
obtain an extract. The extract is purified on a CsCl
step gradient with steps at 1.20 g/ml, 1.36 g/ml and
1.45 g/ml. After centrifugation at 90,000 x g for 8
hours, the AdA vectors are obtained from a fraction
under the top components as identified by the presence of
LacZ, and the helper virus is obtained from a smaller,
denser fraction, as identified by the presence of hpAP.

Example 8 - Construction of Modified Helper Viruses with
Crippled Packaging (PAC) Sequences

This example refers to Figs. 9A through 9C, 10A and
10B in the design of modified helper viruses of this
invention.

Ad5 5' terminal sequences that contained PAC domains
I and II (Fig. 8A) or PAC domains I, II, III, and IV
(Fig. 8B) were generated by PCR from the wild type Ad5 5'
genome depicted in Fig. 1B using PCR clones indicated by
the arrows in Fig. 1B. The sequences of the resulting
amplification products (Fig. 8A and 8B) differed from the
wild-type Ad5 genome in the number of A-repeats carried
by the left (5') end.

As depicted in Fig. 8C, these amplification products
were subcloned into the multiple cloning site of
pAd.Link.1 (IHGT Vector Core). pAd.Link.1 is an
adenovirus based plasmid containing adenovirus m.u. 9.6
through 16.1. The insertion of the modified PAC regions
into pAd.Link.1 generated two vectors, pAd.PACII
(containing PAC domains I and II) and pAd.PACIV
(containing PAC domains I, II, III, and IV).
Thereafter, as depicted in Figs. 10A and 10B, for
each of these plasmids, a human placenta alkaline
phosphatase reporter minigene containing the immediate
early CMV enhancer/promoter (CMV), human placenta
alkaline phosphatase cDNA (hpAP), and SV40
polyadenylation signal (pA), was subcloned into each PAC
vector, generating pAd.PACII.CMVhpAP and
pAd.PACIV.CMVhpAP, respectively.

These plasmids were then used as substrates for
homologous recombination with dl7001 virus, described
above, by co-transfection into 293 cells. Homologous
recombination occurred between the adenovirus map units
9-16 of the plasmid and the crippled Ad5 virus. The
results of homologous recombination were helper viruses
containing Ad5 5' terminal sequences that contained PAC
domains I and II or PAC domains I, II, III, and IV,
followed by the minigene, and Ad5 3' sequences 9.6-78.3
and 87-100. Thus, these crippled viruses are deleted of
the E1 gene and the E3 gene.
The plaque formation characteristics of the PAC
helper viruses gave an immediate indication that the PAC
modifications diminished the rate and extent of growth.
Specifically, PAC helper virus plaques did not develop
until day 14-21 post-transfection, and on maturation
remained small. From previous experience, a standard
first generation Ad.CBhpAP helper virus with a complete
left terminal sequence would begin to develop by day 7
and mature by day 10.

Viral plaques were picked and suspended in 0.5 ml of
DMEM media. A small aliquot of the virus stock was used
to infect a fresh monolayer of 293 cells, which was
histochemically stained for recombinant alkaline
phosphatase activity 24 hours post-infection. Six of
eight Ad.PACIV.CMVhpAP (encodes A-repeats I-IV) clones
that were screened for transgene expression were
positive, while all three Ad.PACII.CMVhpAP clones that
were selected scored positive. The clones have been
taken through two rounds of plaque purification and are
currently being expanded to generate a working stock.
These crippled helper viruses are useful in the
production of the AdA virus particles according to the
procedures described in Example 3. They are
characterized by containing sufficient adenovirus genes
to permit the packaging of the shuttle vector genome, but
their crippled PAC sequences reduce their efficiency for
self-encapsidation. Thus fewer helper viruses are
produced in favor of more AdA recombinant viruses.
Purification of AdA virus particles from helper viruses
is facilitated in the CsCl gradient, which is based on
the weight of the respective viral particles. This
facility in purification is a decided advantage of the
AdA vectors of this invention in contrast to adenovirus
vectors having only E1 or smaller deletions. The AdA
vectors even with minigenes of up to about 15 kb are
significantly different in weight than wild type or other
adenovirus helpers containing many adenovirus genes.

Example 9 - AdA Vector Containing a Full-Length
Dystrophin Transgene

Duchenne muscular dystrophy (DMD) is a common
X-linked genetic disease caused by the absence of
dystrophin, a 427K protein encoded by a 14 kilobase
transcript. Lack of this important sarcolemmal protein
leads to progressive muscle wasting, weakness, and death.
One current approach for treating this lethal disease is
to transfer a functional copy of the dystrophin gene into
the affected muscles. For skeletal muscle, a
replication-defective adenovirus represents an efficient
delivery system.

According to the present invention, a recombinant
plasmid pAdA.CMVmdys was created which contains only the
Ad5 cis-elements (i.e., ITRs and contiguous packaging
sequences) and harbors the full-length murine dystrophin
gene driven by the CMV promoter. This plasmid was
generated as follows.

pSL1180 [Pharmacia Biotech] was cut with Not I,
filled in by Klenow, and religated, thus ablating the Not
I site in the plasmid. The resulting plasmid is termed
pSL1180NN and carries a bacterial ori and Amp resistance
gene.

pAdA.CMVLacZ of Example 1 was cut with EcoRI,
filled in by Klenow, and ligated with the ApaI-cut
pSL1180NN to form pAdA.CMVLacZ (ApaI).

The 14 kb mouse dystrophin cDNA [sequences
provided in C. C. Lee et al, Nature, 349:334-336 (1991)]
was cloned in two large fragments using a lambda ZAP
cloning vector (Stratagene) and subsequently cloned into
the bluescript vector pSK- giving rise to the plasmid
pCCL-DMD. A schematic diagram of this vector is provided
in Fig. 11, which illustrates the restriction enzyme
sites.

pAdA.CMVLacZ (ApaI) was cut with Not I and the
large fragment gel isolated away from the lacZ cDNA.
pCCL-DMD was also cut with Not I, gel isolated and
subsequently ligated to the large NotI fragment of NotI
digested pAdA.CMVLacZ (ApaI). The sequences of the
resulting vector, pAdA.CMVmdys, are provided in
Fig. 12A-12P [SEQ ID NO:10].

This plasmid contains sequences from the left end
of the Ad5 encompassing bp 1-360 (5' ITR), a mouse
dystrophin minigene under the control of the CMV
promoter, and sequence from the right end of Ad5 spanning
bp 35353 to the end of the genome (3' ITR). The minigene
is followed by an SV-40 poly-A sequence similar to that
described for the plasmids described above.

The vector production system described herein is
employed. Ten 150 mm 293 plates are infected at about 90%
confluency with a reporter recombinant E1-deleted virus
Ad.CBhpAP at an MOI of 5 for 60 minutes at 37°C. These
cells are transfected with pAdA.CMVmdys by calcium
phosphate co-precipitation using 50 µg linearized
DNA/dish for about 12-16 hours at 37°C. Media is
replaced with DMEM + 10% fetal bovine serum.

Full cytopathic effect is observed and a cell lysate
is made by subjecting the cell pellet to freeze-thaw
procedures three times. The lysate is subjected to an
SW41 three tier CsCl gradient for 2 hours and a band
migrating between the helper adenovirus and incomplete
virus is detected.

Fractions are assayed on a 6 well plate containing
293 cells infected with 5 µl of fraction for 16-20 hours in
DMEM + 2% FBS. Cells are collected, washed with
phosphate buffered saline, and resuspended in 2 ml PBS.
200 µl of the 2 ml cell suspension is cytospun onto a slide.

The cells were subjected to immunofluorescence for
dystrophin as follows. Cells were fixed in 10N MeOH at
-20°C. The cells were exposed to a monoclonal antibody
specific for the carboxy terminus of human dystrophin
[NCL-DYS2; Novocastra Laboratories Ltd., UK]. Cells were
then washed three times and exposed to a secondary
antibody, i.e. 1:200 goat anti-mouse IgG in FITC.

The titer/fraction for seven fractions revealed in
the immunofluorescent stains was calculated by the
following formula and reported in Table 2 below.
DFU/field = (DFU/200 µl cells) x 10 = DFU/10^6 cells =
(DFU/5 µl viral fraction) x 20 = DFU/100 µl fraction.
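The scaling in the formula above can be sketched as a short calculation. This is a minimal illustration, assuming the volume unit in the formula is microliters (200 µl of a 2 ml cell suspension on the slide, 5 µl of viral fraction per well, titer reported per 100 µl of fraction); the function name and the example count of 12 positive cells are hypothetical.

```python
def dfu_per_100_ul(positive_cells, slide_ul=200.0, cell_suspension_ul=2000.0,
                   inoculum_ul=5.0, reporting_ul=100.0):
    """Scale positive cells counted on one slide to DFU per 100 ul of fraction."""
    # 200 ul of the 2 ml suspension was spun onto the slide: multiply by 10
    per_well = positive_cells * (cell_suspension_ul / slide_ul)
    # the well received 5 ul of fraction: multiply by 20 to reach 100 ul
    return per_well * (reporting_ul / inoculum_ul)


example_count = 12  # hypothetical number of green fluorescent cells on a slide
titer = dfu_per_100_ul(example_count)
print(f"{example_count} positive cells -> {titer:.0f} DFU per 100 ul fraction")
```

Each positive cell on the slide therefore corresponds to 200 DFU per 100 µl of gradient fraction under these assumptions.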






Table 2 



i 

2 
3 



6 X 10 3 



4 1.8 X 10* 

9.6 X 10 J 
200 



200 



A virus capable of transducing the dystrophin
minigene is detected as a "positive" (i.e., green
fluorescent) cell. The results of the IF illustrate that
heat-treated fractions do not show positive
immunofluorescence. Southern blot data suggest one
species of the same size as the input DNA, with helper
virus contamination.

The recombinant virus can be subsequently separated
from the majority of helper virus by sedimentation
through cesium gradients. Initial studies demonstrate
that the functional AdCMVAmDys virions are produced, but
are contaminated with helper virus. Successful
purification would render AdA virions that are incapable
of encoding viral proteins but are capable of transducing
murine skeletal muscle.

Example 10 - Pseudotyping

The following experiment provides a method for
preparing a recombinant AdA according to the invention,
utilizing helper viruses from serotypes which differ from
that of the pAdA in the transfection/infection protocol.
It is unexpected that the ITRs and packaging sequence of
Ad5 could be incorporated into a virion of another
serotype.

A. Protocol

The basic approach is to transfect the
AdA.CMVlacZ recombinant virus (Ad5) into 293 cells and
subsequently infect the cell with the helper virus
derived from a variety of Ad serotypes (2, 3, 4, 5, 7, 8,
12, and 40). When CPE is achieved, the lysate is
harvested and banded through two cesium gradients.

More particularly, the Ad5-based plasmid
pAdA.CMVlacZ of Example 1 was linearized with EcoRI. The
linearized plasmids were then transfected into ten 150 mm
dishes of 293 cells using calcium phosphate
co-precipitation. At 10-15 hours post-transfection, wild
type adenoviruses (of one of the following serotypes: 2,
3, 4, 5, 7, 12, 40) were used to infect cells at an MOI
of 5. The cells were then harvested at full CPE and
lysed by three rounds of freeze-thawing. The pellet was
resuspended in 4 ml Tris-HCl. Cell debris was removed by
centrifugation and partial purification of Ad5A.CMVlacZ
from helper virus was achieved with 2 rounds of CsCl
gradient centrifugation (SW41 rotor, 35,000 rpm, 2
hours). Fractions were collected from the bottom of the
tube (fraction #1) and analysed for lacZ transducing
viruses on 293 target cells by histochemical staining (at
20 h PI). Contaminating helper viruses were quantitated
by plaque assay.

Except for adenovirus type 3, infection with Ad
serotypes 2, 4, 5, 7, 12 and 40 was able to produce lacZ
transducing viruses. The peak of β-galactosidase
activity was detected between the two major A260
absorbing peaks, where most of the helper viruses banded
(data not shown). The quantity of lacZ virus recovered
from 10 plates ranged from 10^4 to 10^8 transducing
particles depending on the serotype of the helper. As
expected, Ad2 and Ad5 produced the highest titer of lacZ
transducing viruses (Table 3). Wild type contamination
was in general 10^2- to 10^3-fold higher than the
corresponding lacZ titer except in the case of Ad40.
B. Results

Table 3 summarizes the growth characteristics
of the wild type adenoviruses as evaluated on propagation
in 293 cells. This demonstrated the feasibility of
utilizing these helper viruses to infect the cell line
which has been transfected with the Ad5 deleted virus.

Table 3

Adenovirus serotypes   p/ml           pfu/ml         p:pfu
2                      5 x 10^12      2.5 x 10^11    20:1
3                      1 x 10^12      6.25 x 10^9    160:1
4                      3 x 10^12      2 x 10^9       150:1
5                      1 x 10^12      5 x 10^10      20:1
7a                     5 x 10^12      1 x 10^11      50:1
12                     6 x 10^11      4 x 10^9       150:1
35                     1.2 x 10^12
40                     2.2 x 10^12    4.4 x 10^8     5000:1

Table 4 summarizes the results of the final
purified fractions. The middle column, labeled LFU/µl,
quantifies the production of lacZ forming units, which is
a direct measure of the packaging and propagation of
pseudotyped recombinant AdA virus. The pfu/µl titer is
an estimate of the contaminating wild type virus. AdA
virus pseudotyped with all adenoviral strains was
generated except for Ad3. The titers range between 10^7
and 10^4.

Table 4
Serotypes LFU/ml 
2 4.6 x 10' 



PFU/ml 
1.8 x 10 9 






3 
4 
5 
7a 
12 
40 



6.7 X 10 e 



6.3 X 10' 
3 X 10 6 



1.2 X 10 s 



9.5 X 10* 



NA 

9.3 X 10 7 
1.9 X 10 9 
1.8 X 10 8 
3.3 X 10 8 
1.5 X 10 3 



Tables 5A-5D represent a more detailed analysis
of the fractions from the second purification for each of
the experiments summarized in Table 4. Again, LFU/µl is
the recovery of the AdA viruses, whereas pfu/µl
represents recovery of the helper virus.





Table 5A

Ad2 Fraction #     Volume (µl)   LFU/µl         PFU/µl
1                  120           9532           8 x 10^6
2                  100           5.8 x 10^4     3 x 10^6
3                  100           8.24 x 10^4    6 x 10^5
4                  100           9.47 x 10^4    1.2 x 10^5
5                  100           6 x 10^4       8 x 10^4
6                  100           2 x 10^4       6 x 10^4
7                  100           5434           5 x 10^4
Total/10 plates                  3.32 x 10^7    1.35 x 10^9
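The totals in Table 5A can be reproduced by summing volume times titer over the fractions. The following is a minimal sketch using the Ad2 values; exponents that are illegible in this copy are read here as 10^4 (fraction 2, LFU) and 10^5 (fractions 3 and 4, PFU), which is an assumption, and the variable names are illustrative.

```python
# (fraction, volume in ul, LFU per ul, PFU per ul) for the Ad2 helper, Table 5A
AD2_FRACTIONS = [
    (1, 120, 9532, 8e6),
    (2, 100, 5.8e4, 3e6),    # LFU exponent assumed to be 4
    (3, 100, 8.24e4, 6e5),   # PFU exponent assumed to be 5
    (4, 100, 9.47e4, 1.2e5), # PFU exponent assumed to be 5
    (5, 100, 6e4, 8e4),
    (6, 100, 2e4, 6e4),
    (7, 100, 5434, 5e4),
]


def column_totals(rows):
    """Sum volume x titer over all fractions for the LFU and PFU columns."""
    total_lfu = sum(vol * lfu for _, vol, lfu, _ in rows)
    total_pfu = sum(vol * pfu for _, vol, _, pfu in rows)
    return total_lfu, total_pfu


total_lfu, total_pfu = column_totals(AD2_FRACTIONS)
print(f"total LFU: {total_lfu:.2e}")
print(f"total PFU: {total_pfu:.2e}")
print(f"helper:vector ratio: {total_pfu / total_lfu:.0f}")
```

With these readings the sums agree with the reported totals of 3.32 x 10^7 LFU and 1.35 x 10^9 PFU, which supports the assumed exponents, and the helper virus exceeds the lacZ transducing virus by roughly 40-fold in this preparation.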



50 
Table 5B

Ad4 Fraction #   VOLUME/μl   LFU/μl        PFU/μl
1                100         1000          1.75 x 10^5
2                100         [illegible]   2.8 x 10^5
3                100         1.8 x 10^4    5.5 x 10^4
4                100         2909          1.25 x 10^4
5                100         920           4 x 10^4
6                100         153           3 x 10^3
Total/10 pH                  [illegible]   5.6 x 10^7

Ad5 Fraction #   VOLUME/μl   LFU/μl        PFU/μl
1                120         1.98 x 10^4   6 x 10^6
2                100         5.8 x 10^4    3 x 10^6
3                100         1.2 x 10^5    1.5 x 10^6
4                100         1 x 10^5      1.4 x 10^5
5                100         7.96 x 10^4   8 x 10^4
6                100         6860          6 x 10^4
Total/10 pH                  3.88 x 10^7   1.2 x 10^9
Table 5C

Ad7 Fraction #   VOLUME/μl   LFU/μl        PFU/μl
1                100         1225          5 x 10^5
2                100         5550          4 x 10^5
3                100         4938          2 x 10^5
4                100         3866          8 x 10^4
5                100         4134          6 x 10^4
6                100         995           7 x 10^4
7                100         230           5 x 10^3
Total/10 pH                  2.09 x 10^6   1.3 x 10^8

Ad12 Fraction #  VOLUME/μl   LFU/μl        PFU/μl
1                100         31            5 x 10^5
2                80          169           8.5 x 10^5
3                80          245           1.8 x 10^5
4                110         161           1.1 x 10^5
5                120         62            7 x 10^3
Total/10 pH                  6.14 x 10^4   1.65 x 10^8
Table 5D

Ad40 Fraction #  VOLUME/μl   LFU/μl        PFU/μl
1                80          61            5
2                80          184           3
3                80          199           3
4                80          168           1
5                80          122
6                100         46
7                100         32
Total/10 pH                  6.65 x 10^4   1.1 x 10^3
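
The specification does not state how the totals rows of Tables 5A-5D were derived. For the Ad5 fractions of Table 5B, however, the reported totals agree with the sum over fractions of volume multiplied by titer. The following is a minimal sketch of that check, assuming this is how the totals were computed; the function names and the helper-to-vector ratio are illustrative and do not appear in the specification.

```python
# Ad5 pseudotype fractions as read from Table 5B:
# (fraction volume in microliters, LFU per microliter, PFU per microliter).
AD5_FRACTIONS = [
    (120, 1.98e4, 6e6),
    (100, 5.8e4, 3e6),
    (100, 1.2e5, 1.5e6),
    (100, 1.0e5, 1.4e5),
    (100, 7.96e4, 8e4),
    (100, 6860, 6e4),
]


def total_units(fractions):
    """Return (total LFU, total PFU), assuming total = sum(volume * titer)."""
    total_lfu = sum(vol * lfu for vol, lfu, _ in fractions)
    total_pfu = sum(vol * pfu for vol, _, pfu in fractions)
    return total_lfu, total_pfu


def helper_to_vector_ratio(fractions):
    """Illustrative summary (not from the source): helper PFU per lacZ forming unit."""
    total_lfu, total_pfu = total_units(fractions)
    return total_pfu / total_lfu


if __name__ == "__main__":
    lfu, pfu = total_units(AD5_FRACTIONS)
    # Table 5B reports 3.88 x 10^7 LFU and 1.2 x 10^9 PFU for Ad5.
    print(f"total LFU = {lfu:.3g}, total PFU = {pfu:.3g}")
    print(f"helper PFU per LFU = {helper_to_vector_ratio(AD5_FRACTIONS):.1f}")
```

The same check can be repeated for the other serotypes by substituting their fraction rows; rows with illegible entries cannot be checked this way.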



C. Characterization of the Structure of Packaged
Viruses

Aliquots of serial fractions were analysed by
Southern blots using lacZ as a probe. In the case of Ad2
and Ad5, not only was the linearized monomer packaged, but
multiple forms of recombinant virus with distinct sizes
were also found. These forms correlated well with the
sizes of dimers, trimers and other higher molecular weight
concatamers. The linearized monomers peaked closer to
the top of the tube (the defective adenovirus band) than
the other forms. When these forms were correlated with
lacZ activity, a better correlation was found with the
higher molecular weight forms than with the monomers. With
pseudotyping of Ad4 and Ad7, no linearized monomers were
packaged and only higher molecular weight forms were
found.

These data definitively demonstrate the production
and characterization of the AdA virus and the different
pseudotypes. This example illustrates a very simple way
of generating pseudotype viruses.
Example 11 - AdA Vector Containing an FH Gene

Familial hypercholesterolemia (FH) is an autosomal
dominant disorder caused by abnormalities (deficiencies)
in the function or expression of LDL receptors [M.S.
Brown and J.L. Goldstein, Science, 222(4746):34-37
(1986); J.L. Goldstein and M.S. Brown, "Familial
hypercholesterolemia" in Metabolic Basis of Inherited
Disease, ed. C.R. Scriver et al, McGraw Hill, New York,
pp. 1215-1250 (1989)]. Patients who inherit one abnormal
allele have moderate elevations in plasma LDL and suffer
premature life-threatening coronary artery disease (CAD).
Homozygous patients have severe hypercholesterolemia and
life-threatening CAD in childhood. An FH-containing
vector of the invention is constructed by replacing the
lacZ minigene in the pAdAc.CMVlacZ vector with a minigene
containing the LDL receptor gene [T. Yamamoto et al,
Cell, 12:27-38 (1984)] using known techniques and as
described analogously for the dystrophin gene and CFTR in
the preceding examples. Vectors bearing the LDL receptor
gene can be readily constructed according to this
invention. The resulting plasmid is termed pAdAc.CMV-LDL.

This plasmid is useful in gene therapy of FH alone,
or preferably, in the form of a conjugate prepared as
described herein, to substitute a normal LDL receptor gene
for the abnormal allele responsible for the disorder.
A. Ex Vivo Gene Therapy

Ex vivo gene therapy can be performed by
harvesting and establishing a primary culture of
hepatocytes from a patient. Known techniques may be used
to isolate and transduce the hepatocytes with the above
vector(s) bearing the LDL receptor gene(s). For example,
techniques of collagenase perfusion developed for rabbit
liver can be adapted for human tissue and used in
transduction. Following transduction, the hepatocytes
are removed from the tissue culture plates and reinfused
into the patient using known techniques, e.g. via a
catheter placed into the inferior mesenteric vein.
B. In Vivo Gene Therapy

Desirably, the in vivo approach to gene
therapy, e.g. liver-directed, involves the use of the
vectors and vector conjugates described above. A
preferred treatment involves infusing a vector LDL
conjugate of this invention into the peripheral
circulation of the patient. The patient is then
evaluated for change in serum lipids and liver tissues.

The virus or conjugate can be used to infect
hepatocytes in vivo by direct injection into a peripheral
or portal vein (10^7-10^8 pfu/kg) or retrograde into the
biliary tract (same dose). This effects gene transfer
into the majority of hepatocytes.

Treatments are repeated as necessary, e.g.
weekly. Administration of a dose of virus equivalent to
an MOI of approximately 20 (i.e. 20 pfu/hepatocyte) is
anticipated to lead to high level gene expression in the
majority of hepatocytes.
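
The dose figures above reduce to simple multiplication. The following is a minimal sketch of that arithmetic only. The body weight and hepatocyte count used in the example are hypothetical inputs chosen for illustration; the specification gives neither, and the function names are likewise illustrative.

```python
def dose_range_pfu(body_weight_kg, low_per_kg=1e7, high_per_kg=1e8):
    """Return the (low, high) total dose in pfu for the 10^7-10^8 pfu/kg range."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return body_weight_kg * low_per_kg, body_weight_kg * high_per_kg


def dose_for_moi(hepatocyte_count, moi=20):
    """Return the pfu needed for a given MOI, i.e. pfu per hepatocyte."""
    if hepatocyte_count <= 0:
        raise ValueError("hepatocyte count must be positive")
    return hepatocyte_count * moi


if __name__ == "__main__":
    # Hypothetical inputs: a 10 kg subject and one million target cells.
    low, high = dose_range_pfu(body_weight_kg=10)
    print(f"10 kg subject: {low:.1e} to {high:.1e} pfu")
    print(f"MOI 20 over 1e6 hepatocytes: {dose_for_moi(1_000_000):.1e} pfu")
```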

All references recited above are incorporated herein
by reference. Numerous modifications and variations of
the present invention are included in the above-identified
specification and are expected to be obvious
to one of skill in the art. Such modifications and
alterations to the compositions and processes of the
present invention, such as various modifications to the
PAC sequences or the shuttle vectors, or to other
sequences of the vector, helper virus and minigene
components, are believed to be encompassed in the scope
of the claims appended hereto.
SEQUENCE LISTING

(1) GENERAL INFORMATION:

     (i) APPLICANT: Trustees of the University of Pennsylvania
                    Wilson, James M.
                    Fisher, Krishna J.
                    Chen, Shu-Jen
                    Weitzman, Matthew

    (ii) TITLE OF INVENTION: Improved Adenovirus and Methods
                             of Use Thereof

   (iii) NUMBER OF SEQUENCES: 10

    (iv) CORRESPONDENCE ADDRESS:
          (A) ADDRESSEE: Howson and Howson
          (B) STREET: Spring House Corporate Cntr, PO Box 457
          (C) CITY: Spring House
          (D) STATE: Pennsylvania
          (E) COUNTRY: USA
          (F) ZIP: 19477

     (v) COMPUTER READABLE FORM:
          (A) MEDIUM TYPE: Floppy disk
          (B) COMPUTER: IBM PC compatible
          (C) OPERATING SYSTEM: PC-DOS/MS-DOS
          (D) SOFTWARE: PatentIn Release #1.0, Version #1.30

    (vi) CURRENT APPLICATION DATA:
          (A) APPLICATION NUMBER:
          (B) FILING DATE:
          (C) CLASSIFICATION:

   (vii) PRIOR APPLICATION DATA:
          (A) APPLICATION NUMBER: US 08/331,381
          (B) FILING DATE: 28-OCT-1994

  (viii) ATTORNEY/AGENT INFORMATION:
          (A) NAME: Bak, Mary E.
          (B) REGISTRATION NUMBER: 31,215
          (C) REFERENCE/DOCKET NUMBER: GNVPN.008PCT

    (ix) TELECOMMUNICATION INFORMATION:
          (A) TELEPHONE: 215-540-9200
          (B) TELEFAX: 215-540-5818
(2) INFORMATION FOR SEQ ID NO:1:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 7897 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: double
          (D) TOPOLOGY: unknown

    (ii) MOLECULE TYPE: cDNA

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:





[illegible] AGCTGAAGCT TGAATTCCAT CATCAATAAT ATACCTTATT 50

[illegible] GCCAATATGA TAATGAGGGG GTGGAGTTTG TGACGTGGCG 100

[illegible] GAACGGGGCG GGTGACGTAG GTTTTAGGGC GGAGTAACTT 150

[illegible] GGAATTGTAG TTTTCTTAAA ATGGGAAGTT ACGTAACGTG 200

[illegible] AGTGACGATT TGAGGAAGTT GTGGGTTTTT TGGCTTTCGT 250

[illegible] AGGTTCGCGT GCGGTTTTCT GGGTGTTTTT TGTGGACTTT 300

AACCGTTACG TCATTTTTTA GTCCTATATA TACTCGCTCT GCACTTGGCC 350

CTTTTTTACA CTGTGACTGA TTGAGCTGGT GCCGTGTCGA GTGGTGTTTT 400

TTTAATAGGT TTTCTTTTTT ACTGGTAAGG CTGACTGTTA GGCTGCCGCT 450

GTGAAGCGCT GTATGTTGTT CTGGAGCGGG AGGGTGCTAT TTTGCCTAGG 500

CAGGAGGGTT TTTCAGGTGT TTATGTGTTT TTCTCTCCTA TTAATTTTGT 550

TATACCTCCT ATGGGGGCTG TAATGTTGTC TCTACGCCTG CGGGTATGTA 600

TTCCCCCCAA GCTTGCATGC CTGCAGGTCG ACTCTAGAGG ATCCGAAAAA 650

ACCTCCCACA CCTCCCCCTG AACCTGAAAC ATAAAATGAA TGCAATTGTT 700

GTTGTTAACT TGTTTATTGC AGCTTATAAT GGTTACAAAT AAAGCAATAG 750

CATCACAAAT TTCACAAATA AAGCATTTTT TTCACTGCAT TCTAGTTGTG 800

GTTTGTCCAA ACTCATCAAT GTATCTTATC ATGTCTGGAT CCCCGCGGCC 850

GCCTAGAGTC GAGGCCGAGT TTGTCAGAAA GCAGACCAAA CAGCGGTTGG 900

AATAATAGCG AGAACAGAGA AATAGCGGCA AAAATAATAC CCGTATCACT 950

TTTGCTGATA TGGTTGATGT CATGTAGCCA AATCGGGAAA AACGGGAAGT 1000
AGGCTCCCAT


GATAAAAAAG 


TAAAAGAAAA 


AGAATAAACC 


GAACATCCAA 


1050 


AAGTTTGTGT 


TTTTTAAATA 


GTACATAATG 


GATTTCCTTA 


CGCGAAATAC 


1100 


GGG CAGACAT 


GGCCTGCCCG 


GTTATTATTA 


TTTTTGACAC 


CAGACCAACT 


1150 


GGTAATGGTA 


GCGACCGGCG 


CTCAGCTGTA 


AxTCCGCCGA 


TACTGACGGG 


1200 


CTCCAGGAGT 


CGTCGCCACC 


AATCCCCATA 


TGGAAACCGT 


CGATATTCAG 


1250 


CCATGTGCCT 


TCTTCCGCGT 


GCAGCAGATG 


GCGATGGCTG 


CTTTCCATCA 


1300 


GTTGCTGTTG 


ACTGTAGCGG 


CTGATGTTGA 


ACTGGAAGTC 


GCCGCGCCAC 


1350 


TGGTGTGGGC 


CATAATTCAA 


TTCGCGCGTC 


CCGCAGCGCA 


GACCGTTTTC 


1400 


GCTCGGGAAG 


ACGTACGGGG 


TATACATGTC 


TGACAATGGC 


AGATCCCAGC 


1450 


GGTCAAAACA 


GGCGGCAGTA 


AGGCGGTCGG 


GATAGTTTTC 


TTGCGGCCCT 


1500 


AATCCGAGCC 


AGTTTACCCG 


CTCTGCTACC 


TGCGCCAGCT 


GGCAGTTCAG 


1550 


GCCAATCCGC 

WW>JW 


GCCGGATGCG 


GTGTATCGCT 

«7 X ^* X«»X X 


CGCCACTTCA 


ACATCAACGG 


1600 


XAAl WwWWAX 


TTGACCACTA 


CCATCAATCC 


GGTAGGTTTT 

WWXAW^7X XXX 


CCGGCTGATA 


1650 


AATAAGGTTT 


TCCCCTGATG 


CTGCCACGCG 

\* X W W W**W\* W»* 


TGAGCGGTCG 


TAATCAGCAC 


1700 


CGCATCAGCA 


AGTGTATCTG 

*IVJ X W X • » X W X w 


CCGTGCACTG 


CAACAACGCT 


GCTTCGGCCT 


1750 


GGTAATGGCC 


WWW W\» W W X X >*• 


CAGCGTTCGA 


CCCAGGCGTT 


AGGGTCAATG 


1800 


wwww X Ww W X X 


CACITACGCC 
wnwx xnwwww 


AATGTCCITA 

An X W X WW X X *» 


TCCAGCGGTG 


CACGGGTGAA 


1850 


w x un X Vv^wi^ 


AGCGGCGTCA 

AU UUVIVrU X WA 


GCACFTGITT 

W VAU X X W XXX 


TTTATCGC CA 

x x xn x \*\*ww*» 


ATCCACATCT 


1900 


GTGAAAGAAA 




CGGTTAAATT 


GCCAACGCTT 


ATTACCCAGC 


1950 


TCGATGCAAA 


AATCCATTTC 


GCTGGTGGTC 


AGATGCGGGA 


TGGCGTGGGA 


2000 


CGCGGCGGGG 


AGCGTCAGAC 


TGAGGTTTTC 


CGCCAGACGC 


CACTGCTGCC 


2050 


AGGCGCTGAT 


GTGCCCGGCT 


TCTGACCATG 


CGGTCGCGTT 


CGGTTGCACT 


2100 


ACGCGTACTG 


TGAGCCAGAG 


TTGCCCGGCG 


CTCTCCGGCT 


GCGGTAGTTC 


2150 


AGGCAGTTCA 


ATCAACTGTT 


TACCTTGTGG 


AGCGACATCC 


AGAGGCACTT 


2200 


CACCGCTTGC 


CAGCGGCTTA 


CCATCCAGCG 


CCACCATCCA 


GTGCAGGAGC 


2250 


TCGTTATCGC 


TATGACGGAA 


CAGGTATTCG 


CTGGTCACTT 


CGATGGTTTG 


2300 
CCCGGATAAA CGGAACTGGA AAAACTGCTG CTGGTGTTTT GCTTCCGTCA 2350

GCGCTGGATG CGGCGTGCGG TCGGGAAAGA CCAGACCGTT CATACAGAAC 2400

TGGCGATCGT TCGGCGTATC GCCAAAATCA CCGCCGTAAG CCGACCACGG 2450

GTTGCCGTTT TCATCATATT TAATCAGCGA CTGATCCACC CAGTCCCAGA 2500

CGAAGCCGCC CTGTAAACGG GGATACTGAC GAAACGCCTG CCAGTATTTA 2550

GCGAAACCGC CAAGACTGTT ACCCATCGCG TGGGCGTATT CGCAAAGGAT 2600

CAGCGGGCGC GTCTCTCCAG GTAGCGAAAG CCATTTTTTG ATGGACCATT 2650

TCGGCACAGC CGGGAAGGGC TGGTCTTCAT CCACGCGCGC GTACATCGGG 2700

CAAATAATAT CGGTGGCCGT GGTGTCGGCT CCGCCGCCTT CATACTGCAC 2750

CGGGCGGGAA GGATCGACAG ATTTGATCCA GCGATACAGC GCGTCGTGAT 2800

TAGCGCCGTG GCCTGATTCA TTCCCCAGCG ACCAGATGAT CACACTCGGG 2850

TGATTACGAT CGCGCTGCAC CATTCGCGTT ACGCGTTCGC TCATCGCCGG 2900

TAGCCAGCGC GGATCATCGG TCAGACGATT CATTGGCACC ATGCCGTGGG 2950

TTTCAATATT GGCTTCATCC ACCACATACA GGCCGTAGCG GTCGCACAGC 3000

GTGTACCACA GCGGATGGTT CGGATAATGC GAAGAGCGCA CGGCGTTAAA 3050

GTTGTTCTGC TTCATCAGCA GGATATCCTG CACCATCGTC TGCTCATCCA 3100

TGACCTGACC ATGCAGAGGA TGATGCTCGT GACGGTTAAC GCCTCGAATC 3150

AGCAACGGCT TGCCGTTCAG CAGCAGCAGA CCATTTTCAA TCCGCACCTC 3200

GCGGAAACCG ACATCGCAGG CTTCTGCTTC AATCAGCGTG CCGTCGGCGG 3250

TGTGCAGTTC AACCACCGCA CGATAGAGAT TCGGGATTTC GGCGCTCCAC 3300

AGTTTCGGGT TTTCGACGTT CAGACGTAGT GTGACGCGAT CGGCATAACC 3350

ACCACGCTCA TCGATAATTT GACCGCCGAA AGGCGCGGTG CCGCTGGCGA 3400

CCTGCGTTTC ACCCTGCCAT AAAGAAACTG TTACCCGTAG GTAGTCACGC 3450

AACTCGCCGC ACATCTGAAC TTCAGCCTCC AGTACAGCGC GGCTGAAATC 3500

ATCATTAAAG CGAGTGGCAA CATGGAAATC GCTGATTTGT GTAGTCGGTT 3550

TATGCAGCAA CGAGACGTGA CGGAAAATGC CGCTCATCCG CCACATATCC 3600
TGATCITCCA
x wa x w x x w- wa 


GATAACTGCC 


GTCACTCCAA 

V* A W#* W A ^» 


CGCAGCACCA 


TCACCGCGAG 


3650 


ftf^ftCnTTCF 

W w X X X X w X 


CCGGCGCGTA 

WWWV\* X mm 


AAAATGCGCT 

MmwmmXWm X w%*w X 


CAGGTCAAAT 


TCAGACGGCA 


3700 


AACGACTGTC 

^W%w w#*W X X W 


CTGGCCGTAA 

a %j %^ «^ An** 


CCGACCCAGC 


GCCCGTTGCA 


CCACAGATGA 


3750 


AACGCCGAGT 


TAACGCCATC 


AAAAATAATT 


CwCGTCTGGC 


CTTCCTGTAG 


3800 


CCAGCTTTCA 


TCAACATTAA 


ATGTGAGCGA 


GTAACAACCC 


GTCGGATTCT 


3850 


CCGTGGGAAC 

Www X wWWX&AW 


AAACGGCGGA 


TTGACCGTAA 


TGGGATAGGT 


TACGTTGGTG 


3900 


TAGATGGGCG 


CATCGTAACC 


GTGCATCTGC 


CAGTTTGAGG 


GGACGACGAC 


3950 


AGTATCGGCC 


TCAGGAAGAT 


CGCACTCCAG 


CCAGCTTTCC 


GGCACCGCTT 


4000 


OTGGTGCCGG 

W X X w W wA7 


AAACCAGGCA 

*>CT*»^^ ^*ATm^AT*mi ^mm» 


AAGCGCCATT 


CGCCATTCAG 


GCTGCGCAAC 


4050 


TGTTGGGAAG 


GGCGATCGGT 


GCGGGCCTCT 


TCGCTATTAC 


GCCAGCTGGC 


4100 




TftTftOTGCAA 

X w A ww X W wAA 


GGCGATTAAG 

WWwWAX XAAW 


TTGGGTAACG 

X XwwwXA*»ww 


CCAGGGTTTT 

%^ W>*»^V^^«M A A A A 


4150 


w w wAw X wAUu 


ALoX luiAAA 


A f^ft A f*ftftft A*P 

A wVaA w bW3A X 


PftPftPTTftAft 

wWwWwX IwAU 


wAW w X ww X X W 


4200 




ftArv^AATftrv* 

VulV wAA X Www 


TWV , AftAPf , ft 
X w w wAwAw wW 


ftPAAPftAAAA 

WwAAwWAAAA 


TO ACftTTCPT 

X wAw W X X w X X 


4250 


ulluulwiAA 


ftTA AAr*ftAOA 

W X AAAwWAwA 


fftft fft & WPP 
luVal WaVX X w 


X X X X X X Ww A X 


TAGCAGGCTC 


4300 


XXX LbAl www 


LuuuAAl Xww 


ftftf^ft^ftftftT 


AwAAX X wwWw 


AWwX X X X AWA 


4350 


uvAuAAu X AA 


wAwX X wwwXA 


PAftftPlTAftA 
UAwaCt XAuA 


AftTA A 

Aw X AAAWW wA 


vwAv X w 


4400 


AvjuAuWiul X 


vXX XuAX XXu 


VJAwwAwwAww 


ft/2 & Vr'Pftftft & 
WW AX Ww WWW A 


WW A WAAAX AA 


4450 


AAbAUAAAAA 


wAwXAAAv-X X 


AfVAfiTA ir* 

AwwAwX XAAw 


X X X w 1\JU XXX 


X X W%W^ X X ww X 


4500 


UuAu X Av wWW 


AX ww X w X AwA 


ftTWftftAftftf 1 


X WWaX wWW X w 


CCGGTCTCTfP 

W ww^* X %^ X w X A 


4550 


CTATGGAGGT 


CAAAACAGCG 


TGGATG6C6T 


CTCCAGGCGA 


TCTGACGGTT 


4600 


CACTAAACGA 


GCTCTGCTTA 


TATAGACCTC 


CCACCGTACA 


CGCCTACCGC 


4650 


CCATTTGCGT 


CAATGGGGCG 


GAGTTGTTAC 


GACATTTTGG 


AAAGTCCCGT 


4700 


TGATTTTGGT 


GCCAAAACAA 


ACTCCCATTG 


ACGTCAATGG 


GGTGGAGACT 


4750 


TGGAAATCCC 


CGTGAGTCAA 


ACCGCTATCC 


ACGCCCATTG 


ATGTACTGCC 


4800 


AAAACCGCAT 


CACCATGGTA 


ATAGCGATGA 


CTAATACGTA 


GATGTACTGC 


4850 


CAAGTAGGAA 


AGTCCCATAA 


GGTCATGTAC 


TGGGCATAAT 


GCCAGGCGGG 


4900 
CCATTTACCG TCATTGACGT CAATAGGGGG CGTACTTGGC ATATGATACA 4950

CTTGATGTAC TGCCAAGTGG GCAGTTTACC GTAAATACTC CACCCATTGA 5000

CGTCAATGGA AAGTCCCTAT TGGCGTTACT ATGGGAACAT ACGTCATTAT 5050

TGACGTCAAT GGGCGGGGGT CGTTGGGCGG TCAGCCAGGC GGGCCATTTA 5100

CCGTAAGTTA TGTAACGACC TGCAGGTCGA CTCTAGAGGA TCTCCCTAGA 5150

CAAATATTAC GCGCTATGAG TAACACAAAA TTATTCAGAT TTCACTTCCT 5200

CTTATTCAGT TTTCCCGCGA AAATGGCCAA ATCTTACTCG GTTACGCCCA 5250 

AATTTACTAC AACATCCGCC TAAAACCGCG CGAAAATTGT CACTTCCTGT 5300 

GTACACCGGC GCACACCAAA AACGTCACTT TTGCCACATC CGTCGCTTAC 5350 

ATGTGTTCCG CCACACTTGC AACATCACAC TTCCGCCACA CTACTACGTC 5400 

ACCCGCCCCG TTCCCACGCC CCGCGCCACG TCACAAACTC CACCCCCTCA 5450 

TTATCATATT GGCTTCAATC CAAAATAAGG TATATTATTG ATGATGCTAG 5500 

CGAATTCATC GATGATATCA GATCTGCCGG TCTCCCTATA GTGAGTCGTA 5550 

TTAATTTCGA TAAGCCAGGT TAACCTGCAT TAATGAATCG GCCAACGCGC 5600 

GGGGAGAGGC GGTTTGCGTA TTGGGCGCTC TTCCGCTTCC TCGCTCACTG 5650 

ACTCGCTGCG CTCGGTCGTT CGGCTGCGGC GAGCGGTATC AGCTCACTCA 5700 

AAGGCGGTAA TACGGTTATC CACAGAATCA GGGGATAACG CAGGAAAGAA 5750 

CATGTGAGCA AAAGGCCAGC AAAAGGCCAG GAACCGTAAA AAGGCCGCGT 5800 

TGCTGGCGTT TTTCCATAGG CTCCGCCCCC CTGACGAGCA TCACAAAAAT 5850 

CGACGCTCAA GTCAGAGGTG GCGAAACCCG ACAGGACTAT AAAGATACCA 5900 

GGCGTTTCCC CCTGGAAGCT CCCTCGTGCG CTCTCCTGTT CCGACCCTGC 5950 

CGCTTACCGG ATACCTGTCC GCCTTTCTCC CTTCGGGAAG CGTGGCGCTT 6000 

TCTCAATGCT CACGCTGTAG GTATCTCAGT TCGGTGTAGG TCGTTCGCTC 6050 

CAAGCTGGGC TGTGTGCACG AACCCCCCGT TCAGCCCGAC CGCTGCGCCT 6100 

TATCCGGTAA CTATCGTCTT GAGTCCAACC CGGTAAGACA CGACTTATCG 6150 

CCACTGGCAG CAGCCACTGG TAACAGGATT AGCAGAGCGA GGTATGTAGG 6200 




CGGTGCTACA GA6TTCTTGA AGTGGTGGCC 
GGACAGTATT TGGTATCTGC GCTCTGCTGA 
AGAGTTGGTA GCTCTTGATC CGGCAAACAA 
TTTTTTTGTT TGCAAGCAGC AGATTACGCG 
AAGATCCTTT GATCTTTTCT ACGGGGTCTG 
TCACGTTAAG GGATTTTGGT CATGAGATTA 
GATCCTTTTA AATTAAAAAT GAAGTTTTAA 
AGTAAACTTG GTCTGACAGT TACCAATGCT 
TCAGCGATCT GTCTATTTCG TTCATCCATA 
GTAGATAACT ACGATACGGG AGGGCTTACC 
TGATACCGCG AGACCCACGC TCACCGGCTC 
CAGCCAGCCG GAAGGGCCGA GCGCAGAAGT 
CTCCATCCAG TCTATTAATT GTTGCCGGGA 
CAGTTAATAG TTTGCGCAAC GTTGTTGCCA 
TCACGCTCGT CGTTTGGTAT GGCTTCATTC 
AAGGCGAGTT ACATGATCCC CCATGTTGTG 
TCGGTCCTCC GATCGTTGTC AGAAGTAAGT 
ATGGTTATGG CAGCACTGCA TAATTCTCTT 
ATGCTTTTCT GTGACTGGTG AGTACTCAAC 
GTATGCGGCG ACCGAGTTGC TCTTGCCCGG 
GCGCCACATA GCAGAACTTT AAAAGTGCTC 
GGGGCGAAAA CTCTCAAGGA TCTTACCGCT 
AACCCACTCG TGCACCCAAC TGATCTTCAG 
GTTTCTGGGT GAGCAAAAAC AGGAAGGCAA 
AAGGGCGACA CGGAAATGTT GAATACTCAT 
ATTGAAGCAT TTATCAGGGT TATTGTCTCA 



PCT/US95/14017 



TAAPTAPGGC 


TACACTAGAA 


6250 


ACPPAGTTAC 

AW*V*AwX X AS* 


CTTCGGAAAA 


6300 


ACCACCGCTG 


CTAGCGGTGG 


6350 


CmGAAAAAAA 

\*aw.aa#****»#»*» 


GGATCTCAAG 


6400 


ACGCTCAGTG 


GAACGAAAAC 


6450 


TCAAAAAGGA 


TCTTCACCTA 


6500 


ATPAATCTAA 

AX WUli \*X A*» 


AGTATATATG 


6550 


T A ATP AGTG A 

X AAX VAwlUA 


GGCACCTATC 


6600 


Ull V7w^ X WAw 


TCCCCGTCGT 


6650 


a wr*rfifzr t { % cc* 

Alvl uuuuwv^ 


AGTGPTGCAA 

Aw X WW X V WAA 


6700 


wiVmx x 1A1 V* 


A CPA AT A AAP 
AvWAA X AAAw 


6750 




Li X 1A1 


6fi00 

WO w V 


AGCTAGAGTA 


AGTAb 1 X wl» w 




TTGCTACAGG 




V 7 w u 


AuC X ILw x x 


PPP A A PC ATP 


6950 


CAAAAAAbw 


Va X lAu^ X X 


7000 




tf21TATPAPTP 


7050 


ALlblwilvv 


PATPPGTAAG 
WAX x /in\j 


7100 


PA AflTP ITTf* 
UAAulVoAl X W 


TGAGAATAGT 

x unvjiin x aw x 


7150 


CGTCAATACG 


GGATAATACC 


7200 


ATCATTGGAA 


AACGTTCTTC 


7250 


GTT6AGATCC 


AGTTCGATGT 


7300 


CATCTTTTAC 


TTTGACCAGC 


7350 


AATGCCGCAA 


AAAAGGGAAT 


7400 


ACTCTTCCTT 


TTTCAATATT 


7450 


T6AGCGGATA 


CATATTTGAA 


7500 
TGTATTTAGA AAAATAAACA AATAGGGGTT CCGCGCACAT TTCCCCGAAA 7550

AGTGCCACCT GACGTCTAAG AAACCATTAT TATCATGACA TTAACCTATA 7600

AAAATAGGCG TATCACGAGG CCCTTTCGTC TCGCGCGTTT CGGTGATGAC 7650

GGTGAAAACC TCTGACACAT GCAGCTCCCG GAGACGGTCA CAGCTTGTCT 7700

GTAAGCGGAT GCCGGGAGCA GACAAGCCCG TCAGGGCGCG TCAGCGGGTG 7750

TTGGCGGGTG TCGGGGCTGG CTTAACTATG CGGCATCAGA GCAGATTGTA 7800

CTGAGAGTGC ACCATATGGA CATATTGTCG TTAGAACGCG GCTACAATTA 7850

ATACATAACC TTATGTATCA TACACATACG ATTTAGGTGA CACTATA 7897
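
The listing prints each sequence as ten-base blocks, five blocks per row, followed by a running position count. The following is a minimal sketch of helpers for reading and re-emitting that layout, under the assumption that every legible token is either a block of A, C, G and T or a numeric counter. The function names are illustrative, not part of any sequence-listing tool. The example also checks that two blocks from the first row of SEQ ID NO:1 reappear as a reverse complement in the row ending at position 5500, which is consistent with an inverted repeat at the two ends of the vector insert.

```python
import re

# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def parse_listing(lines):
    """Join base blocks into one string, dropping numeric position counters."""
    bases = []
    for line in lines:
        for token in line.split():
            if re.fullmatch(r"[ACGT]+", token):
                bases.append(token)
            elif not token.isdigit():
                # Anything else is treated as extraction damage, not data.
                raise ValueError(f"unexpected token in listing: {token!r}")
    return "".join(bases)


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def format_listing(seq, block=10, per_line=5):
    """Re-emit a sequence as rows of blocks followed by the running count."""
    rows = []
    width = block * per_line
    for start in range(0, len(seq), width):
        chunk = seq[start:start + width]
        blocks = [chunk[i:i + block] for i in range(0, len(chunk), block)]
        rows.append(" ".join(blocks) + f" {start + len(chunk)}")
    return rows


if __name__ == "__main__":
    # Two blocks from the first row and three from the row ending at 5500.
    left = parse_listing(["CATCAATAAT ATACCTTATT"])
    right = parse_listing(["CAAAATAAGG TATATTATTG ATGATGCTAG 5500"])
    print(reverse_complement(left) in right)
    print(format_listing(left + right))
```

Rows containing characters other than A, C, G and T raise an error rather than being silently repaired, so damaged rows have to be resolved against the original filing before parsing.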



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7852 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 



GAATTCGCTA 


GCTAGCGGGG GAATACATAC CCGCAGGCGT AGAGACAACA 


50 


TTACAGCCCC 


CATAGGAGGT ATAACAAAAT TAATAGGAGA GAAAAACACA 


100 


TAAACACCTG 


AAAAACCCTC CTGCCTAGGC AAAATAGCAC CCTCCCGCTC 


150 


CAGAACAACA 


TACAGCGCTT CACAGCGGCA GCCTAACAGT CAGCCTTACC 


200 


AGTAAAAAAG 


AAAACCTATT AAAAAAACAC CACTCGACAC GGCACCAGCT 


250 


CAATCAGTCA 


CAGTGTAAAA AAGGGCCAAG TGCAGAGCGA GTATATATAG 


300 


GACTAAAAAA TGACGTAACG GTTAAAGTCC ACAAAAAACA CCCAGAAAAC 


350 


CGCACGCGAA 


CCTACGCCCA GAAACGAAAG CCAAAAAACC CACAACTTCC 


400 


TCAAATCGTC 


ACTTCCGTTT TCCCACGTTA CGTAACTTCC CATTTTAAGA 


450 


AAACTACAAT 


TCCCAACACA TACAAGTTAC TCCGCCCTAA AACCTACGTC 


500 


ACCCGCCCCG 


TTCCCACGCC CCGCGCCACG TCACAAACTC CACCCCCTCA 


550 
GGCTTCAATC

WW X X WAA X W 


CAAAATAAGG 

WAAAAX AAww 


TATATTATTG 


ATGATGCTAG 


600 


CATCATCAAT 

WAX WAX wAAX 


A A*F A *P A PPTT 

AAXAXAWWX X 


&'I"I'*I"'I'CCA'1 U P 
AX X X X WW AX X 


CAACPPAATA 

wAAwW WAA X A 


TGATAATGAG 

X wXAX#W%X Wlw 


650 


\JVJ\J\J X UUAU X 


X iUl UilvU X w 


uuuvuuvjvrvv 


fPGGCAACGGG 

X wwwAAWwww 


GCGGGTGACG 

WW www X wAWw 


700 


XAwiAu X la X w 


UUivAAU ±\9 X 


catcttcpa a 

VAX Ui luVAA 


cs ctccpcga 

w X w X www ww A 


APAPATGTAA 

AWAWA X w X AA 


750 


UvU/i^VjuA X w 


tcgpaaa act 

XVwWAAAAwX 


VlAvU X X X X X w 


ctctgpcppc 

w X w X w www Ww 


GTGTAPAPAG 
w x w x rtwiwnw 


800 


GAACTGAPAA 

wAAw X wA WAA 


X X X X WwW** Ww 


GTTTTAGGPC 

WX X X XAwwWw 


GATGTTGTAG 

Wl X W X X X J» \* 


TAAATTTGGG 


850 


fWPAAPPCAG 

Ww X AAW WwAw 


TA ACA'P'I'I'CC 

X AAV3AX X Xww 


WWAX XXX Www 


GGGAAAAPTG 

wwwAAAAWX w 


AATAAGAGGA 

X \*Wt#* 


900 


AGTGAAATPT 

AV X 19 AAA X W X 


UAAX AAX XXX 


wXwX XAWX WA 


TAGCGCGTAA 

X AwWwWwX AA 


TATTTGTCTA 

XXIX X X wX WX** 


950 




PPTCPAGGTP 

WW XU WAW X w 


GTTAPATAAP 

wX XAWAXAAW 


TTACGGTAAA 

X XAwwwX AAA 


TGGCCCGCCT 

X w\* W W WWW W X 


1000 


UUU X VJAWWwW 


PPAAPCAPPP 

W WAA W WAW W W 


PPCPPPATTC 

WW WWW WAX Xw 


APGTPAATAA 

AWw X WAAX AA 


TGACGTATGT 

X wA W w X A X V X 


1050 


TpppAfpACTA 

X ww waX nul a 


apcpp a atac 

A WW W WAA X Aw 


UwAW XXX W WA 


•PTC A PCTP A A 

X X wA Ww X WAA 


TGGGTGGAGT 

XwwwX wwAw X 


1100 


Ax I IntbGXA 


AAw 1 ViUtUit 


X IwvAu X Aw 


AlUlAuiulA 


TPETaTAPPA 
XwaX Ai wwwa 


1 1 

1X?U 


AGTACGCCCC 


CTATTGACGT 


ik ik m^ ik m^/hii 
waATGACGGT 


AAATwwCCCw 


ww IuuLa X x A 


l£ UU 


TGCCCAGTAC 


% m/* ik ^*^mm% m 

ATGACCTTAT 


GGGACTTTCC 


TACTTGGCAG 


TACATCTACG 




mxmmx f*m/*»ik m 

TATTAGTCAT 


f*/* i ik mm ik no 
CGCTAxXACC 


ATGGTGATwC 




wXAwAXwaax 


i inn 


****** /^/^tfy/^ ik m 
GGG CGTGGAT 


ik ^* o/*^nwnmr ik 
AGCGGlx 1'GA 


CTCACGGGGA 


TTTCCAAGTC 


1 w wAwwwwax 




TGACGTCAAT 


GGGAGTTTGT 


TTTGGCACCA 


ik & ikm/* a & ft*/* 
AAATCAACuu 


wAw XXX wwAa 


i Ann 

X4UU 


% lkm/*m/*/*tnik Ik 

AATGTCGTAA 


CZAACTCCGCC 


CGAx IGACGC 


AAATwVjw uliti 


X AwwWw X\3 X A 


1 A en 


wliw x buunuu 


ItiAiAlAAu 


WAwAw W X Ww X 


TTACTC A A PP 
X X AVJ Xoaaww 


flTPAflATPCP 

W X WAwA X WwW 


1500 


CTGGAGACGC 


CATCCACGCT 


GTTTTGACCT 


CCATAGAAGA 


CACCGGGACC 


1550 


GATCCAGCCT 


CCGGACTCTA 


GAGGATCCGG 


TACTCGAGGA 


ACTGAAAAAC 


1600 


CAGAAAGTTA 


ACTGGTAA6T 


TTAGTCTTTT 


TGTCTTTTAT 


TTCAGGTCCC 


1650 


GGATCCGGTG 


GTGGTGCAAA 


TCAAAGAACT 


GCTCCTCAGT 


GGATGTTGCC 


1700 


TTTACTTCTA 


GGCCTGTACG 


GAAGTGTTAC 


TTCTGCTCTA 


AAAGCTGCGG 


1750 


AATTGTACCC 


GCGGCC6CAA 


TTCCCGGGGA 


TCGAAAGAGC 


CTGCTAAAGC 


1800 


AAAAAAGAAG 


TCACCATGTC 


GTTTACTTTG 


ACCAACAAGA 


ACGTGATTTT 


1850 
CGTTGCCGGT CTGGGAGGCA TTGGTCTGGA CACCAGCAAG GAGCTGCTCA 1900

AGCGCGATCC CGTCGTTTTA CAACGTCGTG ACTGGGAAAA CCCTGGCGTT 1950

ACCCAACTTA ATCGCCTTGC AGCACATCCC CCTTTCGCCA GCTGGCGTAA 2000

TAGCGAAGAG GCCCGCACCG ATCGCCCTTC CCAACAGTTG CGCAGCCTGA 2050

ATGGCGAATG GCGCTTTGCC TGGTTTCCGG CACCAGAAGC GGTGCCGGAA 2100

AGCTGGCTGG AGTGCGATCT TCCTGAGGCC GATACTGTCG TCGTCCCCTC 2150

AAACTGGCAG ATGCACGGTT ACGATGCGCC CATCTACACC AACGTAACCT 2200

ATCCCATTAC GGTCAATCCG CCGTTTGTTC CCACGGAGAA TCCGACGGGT 2250

TGTTACTCGC TCACATTTAA TGTTGATGAA AGCTGGCTAC AGGAAGGCCA 2300

GACGCGAATT ATTTTTGATG GCGTTAACTC GGCGTTTCAT CTCTGGTGCA 2350

ACGGGCGCTG GGTCGGTTAC GGCCAGGACA GTCGTTTGCC GTCTGAATTT 2400

GACCTGAGCG CATTTTTACG CGCCGGAGAA AACCGCCTCG CGGTGATGGT 2450

GCTGCGTTGG AGTGACGGCA GTTATCTGGA AGATCAGGAT ATGTGGCGGA 2500

TGAGCGGCAT TTTCCGTGAC GTCTCGTTGC TGCATAAACC GACTACACAA 2550

ATCAGCGATT TCCATGTTGC CACTCGCTTT AATGATGATT TCAGCCGCGC 2600

TGTACTGGAG GCTGAAGTTC AGATGTGCGG CGAGTTGCGT GACTACCTAC 2650

GGGTAACAGT TTCTTTATGG CAGGGTGAAA CGGAGGTCGC CAGCGGCACC 2700

GCGCCTTTCG GCGGTGAAAT TATCGATGAG CGTGGTGGTT ATGCCGATCG 2750

CGTCACACTA CGTCTGAACG TCGAAAACCC GAAACTGTGG AGCGCCGAAA 2800

TCCCGAATCT CTATCGTGCG GTGGTTGAAC TGCACACCGC CGACGGCACG 2850

CTGATTGAAG CAGAAGCCTG CGATGTCGGT TTCCGCGAGG TGCGGATTGA 2900

AAATGGTCTG CTGCTGCTGA ACGGCAAGCC GTTGCTGATT CGAGGCGTTA 2950

ACCGTCACGA GCATCATCCT CTGCATGGTC AGGTCATGGA TGAGCAGACC 3000

ATGGTGCAGG ATATCCTGCT GATGAAGCAG AACAACTTTA ACGCCGTGCG 3050

CTGTTCGCAT TATCCGAACC ATCCGCTGTG GTACACGCTG TGCGACCGCT 3100

ACGGCCTGTA TGTGGTGGAT GAAGCCAATA TTGAAACCCA CGGCATGGTG 3150
PPAATGAATP


nTPTftAPPtfTA 

W X W X UALvwA 


TCATPPfipfip 

liaA X W W w ww W 


X www X Aw www 


PftaTPfcPPfi*. 
WwAX wAwWwA 




APCPGTAAPft 


PflAATfifZTCP 

vvAA X W XW W 


Aw WW WWa X Ww 


X AAX WAWWWw 


A flT/STf! & TP A 
AwXwXwAX WA 




TPTY2PTP/SPT 

X W X WW X WWW X 


AAfV£AATn&A 


X WAwwW WAWw 


/VCTT'E ATP A 
wWwWX AAX WA 


PP: A Ptf2PP.PTP. 
WwAW w W w W X w 


JJvU 






WwAX WW X X WW 


WWW WWWW X w w 


Aw X A XwAAww 




pccpfMACpp 

wwwwwwAWWW 


fSAPAPPAPCC 


PPAPPfSATlT 
W WAW WwAX AX 


T A TTTYSPPPfi 
XAX X X wWWWw 


ATPT & PP PP P 
AXwXAwwwwW 


i Ann 


UVVj X VlWi lUA 


ACAPPAfiPPP 


X X W W WwwW X w 


T/SPPP A A ATP 
X w W w w AAA X w 


PTPP^TP A A & 
wX WWAX WAAA 




AAniuuLi x x 


Pf2 PT A 


A/S^fSAPPP/;f , 
A uAuA Ww Ww W 


wwww X wA X ww 


TTTOfVX 

X X X wWwAAX A 


JOUU 


WW W WWAUVlWU 


ITCfifiTl AP A 
AluviulAAUA 


f2TPTTf2/ip*2/* 

wX WX X wwwww 


X X X WwwXAAA 


X Aw X wwwAww 


*4 CCA 


WW XXX X Ui 


CTATPPPPP.T 


X X AWAwwwWw 


wwX X wwxwxw 


PP APTPPPTP 
ww Aw 1 www X w 


JOUU 


w/\ X WAW x www 


TCATTA A AT a 
XwAX 1AAA1A 


TCAT/2A&AAP 
XwAXwAAAAW 


flflf* A A r»PPPT 
ww WAAw www X 


wwX wwwwXXA 


JOOU 


PfVSPfVSTfiAT 
CwV* uu x wA X 


x HuuuuAlA 


WwWwwAAwwA 


X wwWWAwX X w 


TwTATwAAww 


<J7AA 

J /uu 


ultluulLll 




AwwwwGwaTw 


CAGCGCTGAw 


MM % « M M« » R H 

GGAAGCAAAA 


3/DU 


wawwaGwAGw 


AGTTTTTCCA 


gttccgttta 


TCCGGGCAAA 


CCATCGAAGT 


t a a a 

3800 


GACCAGCGAA 


TACCTGTTCC 


gtcatagcga 


TAACGAGCTC 


CTGCACTGGA 


3850 


ffv* m m/^ m m^m fwj \ 

TGGTGGCGCT 


^•/*x m/*s*mx % s+ 

GGATGGTAAG 


CCGCTGGCAA 


m#*^*ma*m* % Mm 

GCGGTGAAGT 


GCCTCTGGAT 


3900 


wTwGwTwGAw 


AAGGTAAACA 


GTTGATTGAA 


CTGCCTGAAC 


TACCGCAGCC 


3950 


GGAGAGwGww 


M m m m m « j Wll/wil 

GGGCAACTCT 


GGCTCACAGT 


ACGCGTAGTG 


CAACCGAACG 


4000 


wGAwwGGATG 


GTCAGAAGCC 


GGGCACATGA 


GCGCCTGGCA 


GGAGTGGCGT 


4050 


w IwbLwnAA 


A WW 1 tAu Xw X 


wAwwwTwwww 


M MMM MMmMMM 

GwwGwGTwww 


ACGCCATCCC 


41UU 


GCATCTGACC 


ACCAGCGAAA 


TGGATTTTTG 


CATCGAGCTG 


6GTAATAAGC 


4150 


GTTGGCAATT 


TAACCGCCAG 


TCAGGCTTTC 


TTTCACAGAT 


GTGGATTGGC 


4200 


GATAAAAAAC 


AACTGCTGAC 


GCCGCTGCGC 


GATCAGTTCA 


CCCGTGCACC 


4250 


GCTGGATAAC 


GACATTGGCG 


TAAGTGAAGC 


GACCCGCATT 


GACCCTAACG 


4300 


CCTGGGTCGA 


ACGCTGGAAG 


GCGGCGGGCC 


ATTACCAGGC 


CGAAGCAGCG 


4350 


TTGTTGCA6T 


GCACGGCAGA 


TACACTTGCT 


GATGCGGTGC 


TGATTACGAC 


4400 


CGCTCACGCG 


TGGCAGCATC 


AGGGGAAAAC 


CTTATTTATC 


AGCCGGAAAA 


4450 
CCTACCGGAT


TGATGGTAGT 


GGTCAAATGG 


CGATTACCGT 


TGATGTTGAA 


4500 


GTGGCGAGCG 


ATACACCGCA TCCGGCGCGG ATTGGCCTGA ACTGCCAGCT 


4550 


GGCGCAGGTA 


GCAGAGCGGG 


TAAACTGGCT 


CGGATTAGGG 


CCGCAAGAAA 


4600 


ACTATCCCGA 


CCGCCTTACT 


GCCGCCTGTT 


TToACCGCTG 


GGATCTGCCA 


4650 


TTGTCAGACA 
iiwi vav» awa 


TGTATACCCC 


GTACGTCTTC 


CCGAGCGAAA ACGGTCTGCG 


4700 




CGCGAATTGA 


ATTATGGCCC 


ACACCAGTGG 


CGCGGCGACT 


4750 


X WAw X X WAA 


CATCAGCCGC 


TACAGTCAAC 


AGCAACTGAT 


GGAAACCAGC 


4800 


CATCGCCATC 

WA X vUUWli. w 


TGCTGCACGC 


GGAAGAAGGC 


ACATGGCTGA 


ATATCGACGG 


4850 


TTTPP AT ATG 

X X X V>WAXAXV 


GGGATTGGTG 


GCGACGACTC 


CTGGAGCCCG 


TCAGTATCGG 


4900 


p/v^x ETT JVC A 

LiUwAAl XA\»A 


GCTGAGCGCC 


GGTCGCTACC 


ATTACCAGTT 


GGTCTGGTGT 


4950 


pa a a a ata at 
WaaaaaX aaX 


AATAACCGGG 


CAGGCGATGT 


CTGCCCGTAT 


TTCGCGTAAG 


5000 


fl A A ATCCATT 
uAnn x w wa x x 


ATGTACTATT 


TAAAAAACAC 


AAACTTTTGG 


ATGTTCGGTT 


5050 


TATTPTTTTT 

X AX X v#x X X X X 


CTTTTACTTT 


TTTATCATGG 


GAGCCTACTT 


CCCGTTTTTC 


5100 


V*\«unl x x vvv\m> 


TACATGACAT 


CAACCATATC 


AGCAAAAGTG 


ATACGGGTAT 


5150 


1A1 111 l«vV# 


GCTATTTCTC 


TGTTCTCGCT 


ATTATTCCAA 


PPGCTGTTTG 

W^W\* X wX X X w 


5200 


(jXCLVjI*! ill* 


TGACAAACTC 


GGCCTCGACT 


CTAGGCGGCC 


GC666GATCC 


5250 


fcP APATflATA 
AuAWl X ua X a 


AGATACATTG 


ATGAGTTTGG 


ACAAACCACA 


ACTAGAATGC 


5300 


A PTC A A A A A A 
Ail X l» aaaaaa 


Aluwl X X AX X 


TGTGAAATTT 


GTGATGCTAT 


TGCTTTATTT 


5350 


CT» APPATTA 
ulAAuUll In 


TAAGCTGCAA 


TAAACAAGTT 


AACAACAACA 


ATTGCATTCA 


5400 


TTTTATGTTT 


CAGGTTCAGG 


GGGAGGTGTG 


GGAGGTTTTT 


TCGGATCCTC 




TAGAGTCGAC 


GACGCGAGGC 


TGGATGGCCT 


TCCCCATTAT 


GATTCTTCTC 


5500 


GCTTCCGGCG 


GCATCGGGAT 


GCCCGCGTTG 


CAGGCCATGC 


TGTCCAGGCA 


5550 


GGTAGATGAC 


GACCATCAGG 


GACAGCTTCA AGGATCGCTC 


GCGGCTCTTA 


5600 


CCAGCCTAAC 


TTCGATCACT 


GGACCGCTGA 


TCGTCACGGC 


GATTTATGCC 


5650 


GCCTCGGCGA 


GCACATGGAA 


CGGGTTGGCA 


TGGATTGTAG 


GCGCCGCCCT 


5700 


ATACCTTGTC 


TGCCTCCCCG 


CGTTGCGTCG 


CGGTGCATGG 


AGCCGGGCCA 


5750 
CCTCGACCTU


AA 1 Vivi AACiU U 


CmCGGGACCT 


MMMfnm m mmm ■» 

CGCTAACGGA 


i i CA CCAt7± C 


CQAA 


CAAGAATTGG AGCCAATCAA TTCTTGCGGA GAACTGTGAA TGCGGAAACC 5850
AACCCTTGGC AGAACATATC CATCGCGTCC GCCATCTCCA GCAGCCGCAC 5900
GCGGCGCATC TCGGGCAGCG TTGGGTCCTG GCCACGGGTG CGCATGATCG 5950
TGCTCCTGTC GTTGAGGACC CGGCTAGGCT GGCGGGGTTG CCTTACTGGT 6000
TAGCAGAATG AATCACCGAT ACGCGAGCGA ACGTGAAGCG ACTGCTGCTG 6050
CAAAACGTCT GCGACCTGAG CAACAACATG AATGGTCTTC GGTTTCCGTG 6100
TTTCGTAAAG TCTGGAAACG CGGAAGTCAG CGCCCTGCAC CATTATGTTC 6150
CGGATCTGCA TCGCAGGATG CTGCTGGCTA CCCTGTGGAA CACCTACATC 6200
TGTATTAACG AAGCCTTTCT CAATGCTCAC GCTGTAGGTA TCTCAGTTCG 6250
GTGTAGGTCG TTCGCTCCAA GCTGGGCTGT GTGCACGAAC CCCCCGTTCA 6300
GCCCGACCGC TGCGCCTTAT CCGGTAACTA TCGTCTTGAG TCCAACCCGG 6350
TAAGACACGA CTTATCGCCA CTGGCAGCAG CCACTGGTAA CAGGATTAGC 6400
AGAGCGAGGT ATGTAGGCGG TGCTACAGAG TTCTTGAAGT GGTGGCCTAA 6450
CTACGGCTAC ACTAGAAGGA CAGTATTTGG TATCTGCGCT CTGCTGAAGC 6500
CAGTTACCTT CGGAAAAAGA GTTGGTAGCT CTTGATCCGG CAAACAAACC 6550
ACCGCTGGTA GCGGTGGTTT TTTTGTTTGC AAGCAGCAGA TTACGCGCAG 6600
AAAAAAAGGA TCTCAAGAAG ATCCTTTGAT CTTTTCTACG GGGTCTGACG 6650
CTCAGTGGAA CGAAAACTCA CGTTAAGGGA TTTTGGTCAT GAGATTATCA 6700
AAAAGGATCT TCACCTAGAT CCTTTTAAAT TAAAAATGAA GTTTTAAATC 6750
AATCTAAAGT ATATATGAGT AAACTTGGTC TGACAGTTAC CAATGCTTAA 6800
TCAGTGAGGC ACCTATCTCA GCGATCTGTC TATTTCGTTC ATCCATAGTT 6850
GCCTGACTCC CCGTCGTGTA GATAACTACG ATACGGGAGG GCTTACCATC 6900
TGGCCCCAGT GCTGCAATGA TACCGCGAGA CCCACGCTCA CCGGCTCCAG 6950
ATTTATCAGC AATAAACCAG CCAGCCGGAA GGGCCGAGCG CAGAAGTGGT 7000
CCTGCAACTT TATCCGCCTC CATCCAGTCT ATTAATTGTT GCCGGGAAGC 7050
TAGAGTAAGT AGTTCGCCAG TTAATAGTTT GCGCAACGTT GTTGCCATTG 7100
CTGCAGGCAT CGTGGTGTCA CGCTCGTCGT TTGGTATGGC TTCATTCAGC 7150
TCCGGTTCCC AACGATCAAG GCGAGTTACA TCATCCCCCA TGTTGTGCAA 7200
AAAAGCGGTT AGCTCCTTCG GTCCTCCGAT CGTTGTCAGA AGTAAGTTGG 7250
CCGCAGTGTT ATCACTCATG GTTATGCCAG CACTGCATAA TTCTCTTACT 7300
GTCATGCCAT CCGTAAGATG CTTTTCTGTG ACTGGTGAGT ACTCAACCAA 7350
GTCATTCTGA GAATAGTGTA TGCGGCGACC GAGTTGCTCT TGCCCGGCGT 7400
CAACACGGGA TAATACCGCG CCACATAGCA CAACTTTAAA AGTGCTCATC 7450
ATTGGAAAAC GTTCTTCGGG GCGAAAACTC TCAAGGATCT TACCGCTGTT 7500
GAGATCCAGT TCGATGTAAC CCACTCGTGC ACCCAACTGA TCTTCAGCAT 7550
CTTTTACTTT CACCAGCGTT TCTGGGTGAG CAAAAACAGG AAGGCAAAAT 7600
GCCGCAAAAA AGGGAATAAG GGCGACACGG AAATGTTGAA TACTCATACT 7650
CTTCCTTTTT CAATATTATT GAAGCATTTA TCAGGGTTAT TGTCTCATGA 7700
GCGGATACAT ATTTGAATGT ATTTAGAAAA ATAAACAAAT AGGGGTTCCG 7750
CGCACATTTC CCCGAAAAGT GCCACCTGAC GTCTAAGAAA CCATTATTAT 7800
CATGACATTA ACCTATAAAA ATAGGCGTAT CACGAGGCCC TTTCGTCTTC 7850
AA 7852



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9972 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 
TCTTCCGCTT CCTCGCTCAC TGACTCGCTG CGCTCGGTCG TTCGGCTGCG 
GCGAGCGGTA TCAGCTCACT CAAAGGCGGT AATACGGTTA TCCACAGAAT 
wawwww A X AA


CGCAGGAA^C 
WWWAwwAA*iw 


AACATGTGAC 
AAwAX w X wAw 


C A A A AGGCC A 
wAAAAwwwwA 


GC A A A AGGCC 

WWAAAAWWWW 


150 

X</ w 


I/MA ACCGT A 


a a a ACftrrcr 

AAAAWWWWWW 


WX X WWX WW WW 


X X X X X WWAX a 


WWW X WW WW WW 


200 


CCCTGACGAG 

WWW X WAW VJAU 


CATC A C A AA A 
wax wiv/uuvn 


ATCGACGCTC 

A X WWAWW W X W 


AAGTCAGAGG 

AAW X WAW A WW 


TGGCGAAACC 


250 


CGACAGGACT 


ATAAAGATAC 


CAGGCGTTTC 

WAWW^ W W X X X W 


C^CCTGGAAG 


CTCCCTCGTG 


300 


OGO'PCTCCPG 

WWW X W A WWX W 


TTCCGACCCT 

X X vvwflWV X 


GCCGCTTACC 

ww www x inw 


GGATACCTGT 

WW#*XX»WW X w X 


CCGCCTTTCT 

«^ ^0 AAA A 


350 


CCCTTCGGGA 
www x x w w%vw<% 


AGCGTGGCGC 

AW WW X WWW WW 


TTTCTCATAG 

X X X W X WA X AW 


CTCACGCTGT 

W X WIlVwW X W X 


AGGTATCTCA 


400 


GTTCGGTGTA 

W X X Www X W X ** 


GGTCGTTCGC 


TCCAAGCTGG 


GCTGTGTGCA 


CGAACCCCCC 


450 


GTTCAGCCCG 

W X X WlWv W WW 


ACCGCTGCGC 

X»W WW W X www w 


CTTATCCGGT 

WX XXIX W WWW X 


AACTATCGTC 

^%AWXX&X WW X W 


TTGAGTCCAA 


500 


CCCGGTAAGA 

WW Www X AAUA 


CACGACTTAT 

WAWWAWX AA1 


CGCCACTGGC 

WW W WAW X WWW 


AGCAGCCACT 


GGTAACAGGA 


550 


TTAGCAGAGC 


G A GGTA TGT & 
wAwwX AXwXA 


wwwwwX WWX A 


WAwAwX X w X X 


GAAGTGGTGG 

W AA W X WW X ww 


600 


WW X AAW X AW W 


GCPACACTAG 
WW X A WAW X Aw 


A AG A AC AGT A 
AA w AA WAw X A 


X X XwwXAX wX 


GCGCPCPGCP 
w www X W X WW X 


650 


waAwwwAw X X 


Aw w X X wwwAA 


A A A Ar ,f PfV2/2 
AaAwawX X ww 


X AwwX w X X wA 


TCCGGC A A A C 
X w wwwwAAAw 


700 


AAAWWAWWwW 


x\a%a X Aw Www X 


wwX X X X 1 X X w 


XX XwwaAwWa 


urWAwAX XAww 


/ 9U 


S*f*f*lk/ 1 *ft ft ft ft ft 

CwwawaAAAA 


ft ft ft fTV ft 

AAwwAxwxCA 


ft O ft ft ft mM/yn 

AGAAwATwCT 


xXGATw*Tx'"X"X* 


w X Awwwww X w 


AAA 


m/* ft mii^ ft ^ 

TGACGCTCAG 


TGGAACGAAA 


ACTCACGTTA 


AGGGATTTTG 


G X wA X wAwA X 


o en 


Wftffl/^ft ft Ik "ft ft ^ 

TATwaAAAAw 


GATCTTCAuC 


TAGATCCTTT 


ffflft ft ft nvrtft ft * * 

TaAAX XAAAA 


AX wAAwX XXX 




AAAX wAAX wX 


ft ft ft <»*mft mft mft 


iv* & rmx ft ft fwn 
ruAuTAAACi 


TWTPTr' ft r>ft 

luvl wX uAtA 


/2TTACCA ATG 
wX XAwwAAX w 




/ *>im»ft Brn/ > «ft/«m 


/■» ft /"*/"• Oft OOHlft 

wAwwwawwTA 


TwTwawWwAT 


wTwTwX AX X X 


WwX XwaX WWA 


i nnn 

X V 


TAGTTGCCTG 


ft rwpr*^o^^vn^ 
ACTCCCCGTC 


/*mf*mft ft fit* ft 

wTwTAwATAA 


CIA w w A x A w w 


WWAWWWWX XA 




CCATCTGGCC CCAGTGCTGC AATGATACCG CCAGACCCAC GCTCACCGGC 1100
TCCAGATTTA TCAGCAATAA ACCAGCCAGC CGGAAGGGCC GAGCGCAGAA 1150
GTGGTCCTGC AACTTTATCC GCCTCCATCC AGTCTATTAA TTGTTGCCGG 1200
GAAGCTAGAG TAAGTAGTTC GCCAGTTAAT AGTTTGCGCA ACGTTGTTGC 1250
CATTGCTACA GGCATCGTGG TGTCACGCTC GTCGTTTGGT ATGGCTTCAT 1300
TCAGCTCCGC TTCCCAACGA TCAAGGCGAG TTACATGATC CCCCATGTTG 1350
TGCAAAAAAG CGGTTAGCTC CTTCGGTCCT CCGATCGTTG TCAGAAGTAA 1400
CTTGGCCGCA

W X X wUvUwWl 


GTGTTATCAC 


TCATGGTTAT 


GGCAGCACTG 


CATAATTCTC 


1450 


J, X nv X W X W A X 


GCCATCCGTA 


AGATGCTTTT 


CTGTGACTGG 


TGAGTACTCA 


1500 


awwaaw x vn x 


TCTGAGAATA 


GTGTATGCGG 


CGACCGAGTT 


GCTCTTGCCC 


1550 


GGCGTCAATA 

Www X VflnAn 


CGGGATAATA 


CCGCGCCACA 


TAGCAGAACT 


TTAAAAGTGC 


1600 


TPATCATTGG 

X Wl X W«\ X aww 


AAAACGTTCT TCGGGGCGAA AACTCTCAAG GATCTTACCG 


1650 


PTGTTGAGAT 

W X W x x WX* w.A A 


CCAGTTCGAT 


GTAACCCACT 


CGTGCACCCA 


ACTGATCTTC 


1700 


awwax w x x x x 


ACTTTCACCA 


GCGTTTCTGG 


GTGAGCAAAA 


ACAGGAAGGC 


1750 


A A & ATGCCGC 


AAAAAAGGGA 


ATAAGGGCGA 


CACGGAAATG 


TTGAATACTC 


1800 


ax nv x \* * a ww 


TTTTTCAATA 


TTATTGAAGC 


ATTTATCAGG 


GTTATTGTCT 


1850 


PATCAGPGGA 
wa x vanuuuuA 


TACATATTTG 


AATGTATTTA 


GAAAAATAAA 


CAAATAGGGG 


1900 


X X vvbwuUnv 


ATTTCCCCGA 


AAAGTGCCAC 


CTGACGTCTA 


AGAAACCATT 


1950 




CATTAACCTA 


TAAAAATAGG 


CGTATCACGA 


GGCCCTTTCG 


2000 


XwX wwwwwwX 


TTCGGTGATG 


ACGGTGAAAA 


CCTCTGACAC 


ATGCAGCTCC 


2050 


WwV?AwAw X 


CACAGCTTGT 


CTGTAAGCGG 


ATGCCGGGAG 


CAGACAAGCC 


2100 


ww X vAbuutu 


VJw X VAUUlUij 


lul X www www 


X W X wwwWwW M 


GGCTTAACTA 


2150 


X GwwwwaX wa 


GAGCAGATTG 


TACTGAGAGT 


GCACCATAAA 


ATTGTAAACG 


2200 


X X aAX Ai XXX 


GTTAAAATTC 


GCGTTAAATT 


TTTGTTAAAT 


CAGCTCATTT 


2250 


X X X AAwwAAA 


AGGCCGAAAT 


CGGCAAAATC 


CCTTATAAAT 


CAAAAGAATA 


2300 


CfWYtAfSATA 

wwwwwAwA X A 


GGGTTGAGTG 


TTGTTCCAGT 


TTGGAACAAG 


AGTCCACTAT 


2350 


TAAAGAACGT GGACTCCAAC GTCAAAGGGC GAAAAACCGT CTATCAGGGC 2400
GATGGCCCAC TACGTGAACC ATCACCCAAA TCAAGTTTTT TGGGGTCGAG 2450
GTGCCGTAAA GCACTAAATC GGAACCCTAA AGGGAGCCCC CGATTTAGAG 2500
CTTGACGGGG AAAGCCGGCG AACGTGGCGA GAAAGGAAGG GAAGAAAGCG 2550
AAAGGAGCGG GCGCTAGGGC GCTGGCAAGT GTAGCGGTCA CGCTGCGCGT 2600
AACCACCACA CCCGCCGCGC TTAATGCGCC GCTACAGGGC GCGTACTATG 2650
GTTGCTTTGA CGTATGCGGT GTGAAATACC GCACAGATGC GTAAGGAGAA 2700
AATACCGCAT


CAGGCGCCAT 

^^mrn^m^m ^0^m^0^&mm>mm 


TCGCCATTCA 


GGCTGCGCAA 

^0^0 ^0 ^ ^r^m ^mm mm m 


CTGTTGGGAA 


2750 


GGGCGATCGG 


TGCGGGCCTC 


TTCGCTATTA 

* * %^^P ^0 m* mm mm mm mm 


CGCCAGCTGG 


CGAAAGGGGG 


2800 


ATGTGCTGCA 

fix x w w- x wwn 


AGG CGATT AA 

#*www%w#*x xn<& 


GTTGGGTAAC 

WX XwwwXAf^W 


GCCAGGGTTT 


TCCCAGTCAC 

m» %0^0^0Mm>%m ^0rnm>^0 


2850 


GACGTTGTAA 

unwu X X w X AA 


AACGACGGCC 
nnvun www w w 


AGTGCCAAGC 

f&wX WWWfMlWW 


T^AAGGTGCA 

X X AAUW X 


CGGCCCACGT 


2900 


GGCCACTAGT 

UUVWIV X Aw X 


& CTTCTCG AG 

X* W X X W X W Wl w 


CTCTGTACAT 

W X WX w 


GTCCGCGGTC 


GCGACGTACG 


2950 


CGTATCGATG 

Ww X AX W WAX w 


GCGCCAGCTG 

wvVJVvflSJv X w 


CAGGCGGCCG 


CCATATGCAT 


CCTAGGCCTA 


3000 


TTAATATTCC 


GGAGTATACG 


TAGCCGGCTA 


ACGTTAACAA 


CCGGTACCTC 


3050 


TAGAACTATA 


GCTAGCCAAT 


TCCATCATCA 


ATAATATACC 


TTATTTTGGA 


3100 


TTGAAGCCAA 
x x w*%**www*w» 


TATG ATAATG 

XXV J. WXkAXV^X \9 


AGGGGGTGGA 


GTTTGTGACG 

AAA A WCTVw 


TGGCGCGGGG 


3150 


CGTGGGAACG 


GGGCGGGTGA 


CGTAGGTTTT 

W W X AVJU X X X X 


AGGGCGGAGT 

X>WW*i7 w*Vrf wl»w A 


AACTTGTATG 


3200 


TGTTGGG A AT 


TGT A GTTTTP 

X w X Aw X X X X W 


TTAAAATGGG 

X X AAAAX www 


AAGTTACGTA 

AAwX XAWwXA 


ACGT^GGGAAA 


3250 


A Www AAV X \JA 


wwAX x lunlau 


& ACTTCTGGfl 

AA w iiU X www 


X X X X X X www X 


TT PGTTTCTC 

X X W w X X X W X w 


3300 


uuLulnViui 1 


vuUVj x l» Uuu X 


X IXtlvwXu 


X X X X X XwXuu 


APTTTAAPPfl 

AwX X X AAwww 


3350 


x XACGX GAX 1 


XXX lAvilLLl 


AlAiAXAUlt 


ww X w XuUAU X 


X ww www X X X X 




1 1 AuVwl G xvi 


Aw XX* AX IvAu 




bltVjAVaXbbl 


uXXXXXX X AA 




XAwVaX X X Iwl 


X X X X In^lub 


X Anuvv X vAw 


Xu X X nwt Xw 


WwwW X w X wAA 


3500 


Uww^ x ulnl w 






ww X AX X X X ww 


CTAGGCAGGA 

W X ^Iww VAVWl 


3550 


WJUl XXX X WA 




X Ul X X X X WX W 


TPPT ATT A AT 

X wWXAX XAAX 


TTTGTTATAC 


3600 




GGCTGTAATG 


T'lWPPTCTAP 

X X w X w X w X Aw 


GCCTGCGGGT 

Uwv X w Ww^*w X 


ATGTATTCCC 

AAVAA4 A W^WW 


3650 


CCCAAGCTTG CATGCCTGCA GGTCGACTCT AGAGGATCCG AAAAAACCTC 3700
CCACACCTCC CCCTGAACCT GAAACATAAA ATGAATGCAA TTGTTGTTGT 3750
TAACTTGTTT ATTGCAGCTT ATAATGGTTA CAAATAAAGC AATAGCATCA 3800
CAAATTTCAC AAATAAAGCA TTTTTTTCAC TGCATTCTAG TTGTGGTTTG 3850
TCCAAACTCA TCAATGTATC TTATCATGTC TGGATCCCCC TAGCTTGCCA 3900
AACCTACAGG TGGGGTCTTT CATTCCCCCC TTTTTCTGGA GACTAAATAA 3950
AATCTTTTAT TTTATCTATG GCTCGTACTC TATAGGCTTC AGCTGGTGAT 4000
ATTGTTGAGT CAAAACTAGA GCCTGGACCA CTGATATCCT GTCTTTAACA 4050

AATTGGACTA ATCGCGGGAT CAGCCAATTC CATGAGCAAA TGTCCCATGT 4100 

CAACATTTAT GCTGCTCTCT AAAGCCTTGT ATCTTGCATC TCTTCTTCTG 4150 

TCTCCTCTTT CAGAGCAGCA ATCTGGGGCT TAGACTTGCA CTTGCTTGAG 4200 

TTCCGGTGGG GAAAGAGCTT CACCCTGTCG GAGGGGCTGA TGGCTTGCCG 4250 

GAAGAGGCTC CTCTCGTTCA GCAGTTTCTG GATGGAATCG TACTGCCGCA 4300 

CTTTGTTCTC TTCTATGACC AAAAATTGTT GGCATTCCAG CATTGCTTCT 4350 

ATCCTGTGTT CACAGAGAAT TACTGTGCAA TCAGCAAATG CTTGTTTTAG 4400 

AGTTCTTCTA ATTATTTGGT ATGTTACTGG ATCCAAATGA GCACTGGGTT 4450 

CATCAAGCAG CAAGATCTTC GCCTTACTGA GAACAGATCT AGCCAAGCAC 4500 

ATCAACTGCT TGTGGCCATG GCTTAGGACA CAGCCCCCAT CCACAAGGAC 4550 

AAAGTCAAGC TTCCCAGGAA ACTGTTCTAT CACAGATCTG AGCCCAACCT 4600 

CATCTGCAAC TTTCCATATT TCTTGATCAC TCCACTGTTC ATAGGGATCC 4650 

AAGTTTTTTC TAAATGTTCC AGAAAAAATA AATACTTTCT GTGGTATCAC 4700 

TCCAAAGGCT TTCCTCCACT GTTGCAAAGT TATTGAATCC CAAGACACAC 4750 

CATCGATCTG GATTTCTCCT TCAGTGTTCA GTAGTCTCAA AAAAGCTGAT 4800 

AACAAAGTAC TCTTCCCTGA TCCAGTTCTT CCCAAGAGGC CCACCCTCTG 4850 

GCCAGGACTT ATTGAGAAGG AAATGTTCTC TAATATGGCA TTTCCACCTT 4900 

CTGTGTATTT TGCTGTGAGA TCTTTGACAG TCATTTGGCC CCCTGAGGGC 4950 

CAGATGTCAT CTTTCTTCAC GTGTGAATTC TCAATAATCA TAACTTTCGA 5000 

GAGTTGGCCA TTCTTGTATG GTTTGGTTGA CTTGGTAGGT TTACCTTCTG 5050 

TTGGCATGTC AATGAACTTA AAGACTCGGC TCACAGATCG CATCAAGCTA 5100 

TCCACATCTA TGCTGGAGTT TACAGCCCAC TGCAATGTAC TCATGATATT 5150 

CATGGCTAAA GTCAGGATAA TACCAACTCT TCCTTCTCCT TCTCCTGTTG 5200 

TTAAAATGGA AATGAAGGTA ACAGCAATGA AGAAGATGAC AAAAATCATT 5250 

TCTATTCTCA TTTGGAACCA GCGCAGTGTT GACAGGTACA AGAACCAGTT 5300 
GGCAGTATGT AAATTCAGAG CTTTGTGGAA CAGAGTTTCA AAGTAAGGCT 5350
GCCGTCCGAA GGCACGAAGT GTCCATAGTC CTTTTAAGCT TGTAACAAGA 5400
TGAGTGAAAA TTGGACTCCT GCCTTCAGAT TCCAGTTGTT TGAGTTGCTG 5450
TGAGGTTTGG AGGAAATATG CTCTCAACAT AATAAAAGCC ACTATCACTG 5500
GCACTGTTGC AACAAAGATG TAGGGTTGTA AAACTGCGAC AACTGCTATA 5550
GCTCCAATCA CAATTAATAA CAACTGGATG AAGTCAAATA TGGTAAGAGG 5600
CAGAAGGTCA TCCAAAATTG CTATATCTTT GGAGAATCTA TTAAGAATCC 5650
CACCTGCTTT CAACGTGTTG AGGGTTGACA TAGGTGCTTG AAGAACAGAA 5700
TGTAACATTT TGTGGTGTAA AATTTTCGAC ACTGTGATTA GAGTATGCAC 5750
CAGTGGTAGA CCTCTGAAGA ATCCCATAGC AAGCAAAGTG TCGGCTACTC 5800
CCACGTAAAT GTAAAACACA TAATACGAAC TGGTGCTGGT GATAATCACT 5850
GCATAGCTGT TATTTCTACT ATGAGTACTA TTCCCTTTGT CTTGAAGAGG 5900
AGTGTTTCCA AGGAGCCACA GCACAACCAA AGAAGCAGCC ACCTCTGCCA 5950
GAAAAATTAC TAAGCACCAA ATTAGCACAA AAATTAAGCT CTTGTGGACA 6000
GTAATATATC GAAGGTATGT GTTCCATGTA GTCACTGCTG GTATGCTCTC 6050
CATATCATCA AAAAAGCACT CCTTTAAGTC TTCTTCGTTA ATTTCTTCAC 6100
TTATTTCCAA GCCAGTTTCT TGAGATAACC TTCTTGAATA TATATCCAGT 6150
TCAGTCAAGT TTGCCTGAGG GGCCAGTGAC ACTTTTCGTG TGGATGCTGT 6200
TGTCTTTCGG TGAATGTTCT GACCTTGGTT AACTGAGTGT GTCATCAGGT 6250
TCAGGACAGA CTGCCTCCTT CGTGCCTGAA GCGTGGGGCC AGTGCTGATC 6300
ACGCTGATGC GAGGCAGTAT CGCCTCTCCC TGCTCAGAAT CTGGTACTAA 6350
GGACAGCCTT CTCTCTAAAG GCTCATCAGA ATCCTCTTCG ATGCCATTCA 6400
TTTGTAAGGG AGTCTTTTGC ACAATGGAAA ATTTTCGTAT AGAGTTGATT 6450
GGATTGAGAA TAGAATTCTT CCTTTTTTCC CCAAACTCTC CAGTCTGTTT 6500
AAAAGATTGT TTTTTTGTTT CTGTCCAGGA GACAGGAGCA TCTCCTTCTA 6550
ATGAGAAACG GTGTAAGGTC TCAGTTAGGA TTGAATTTCT TCTTTCTGCA 6600
CTAAATTGGT CGAAAGAATC ACATCCCATG AGTTTTGAGC TAAAGTCTGG 6650
CTGTAGATTT TGGAGTTCTG AAAATGTCCC ATAAAAATAG CTGCTACCTT 6700
CATGCAAAAT TAATATTTTG TCAGCTTTCT TTAAATGTTC CATTTTAGAA 6750
GTGACCAAAA TCCTAGTTTT GTTAGCCATC AGTTTACAGA CACAGCTTTC 6800
AAATATTTCT TTTTCTGTTA AAACATCTAG GTATCCAAAA GGAGAGTCTA 6850
ATAAATACAA ATCAGCATCT TTGTATACTG CTCTTGCTAA AGAAATTCTT 6900
GCTCGTTGAC CTCCACTCAG TGTGATTCCA CCTTCTCCAA GAACTATATT 6950
GTCTTTCTCT GCAAACTTGG AGATGTCCTC TTCTAGTTGG CATGCTTTGA 7000
TGACGCTTCT GTATCTATAT TCATCATAGG AAACACCAAA GATGATATTT 7050
TCTTTAATGG TGCCAGGCAT AATCCAGGAA AACTGAGAAC AGAATGAAAT 7100
TCTTCCACTG TGCTTAATTT TACCCTCTGA AGGCTCCAGT TCTCCCATAA 7150
TCATCATTAG AAGTGAAGTC TTGCCTGCTC CAGTGGATCC AGCAACCGCC 7200
AACAACTGTC CTCTTTCTAT CTTGAAATTA ATATCTTTCA GGACAGGAGT 7250
ACCAAGAAGT GAGAAATTAC TGAAGAAGAG GCTGTCATCA CCATTAGAAG 7300
TTTTTCTATT GTTATTGTTT TGTTTTGCTT TCTCAAATAA TTCCCCAAAT 7350
CCCTCCTCCC AGAAGGCTGT TACATTCTCC ATCACTACTT CTGTAGTCGT 7400
TAAGTTATAT TCCAATGTCT TATATTCTTG CTTTTGTAAG AAATCCTGTA 7450
TTTTGTTTAT TGCTCCAAGA GAGTCATACC ATGTTTGTAC AGCCCAGGGA 7500
AATTGCCGAG TGACCGCCAT GCGCAGAACA ATGCAGAATG AGATGGTGGT 7550
GAATATTTTC CGGAGGATGA TTCCTTTGAT TAGTGCATAG GGAAGCACAG 7600
ATAAAAACAC CACAAAGAAC CCTGAGAAGA AGAAGGCTGA GCTATTGAAG 7650
TATCTCACAT AGGCTGCCTT CCGAGTCAGT TTCAGTTCTG TTTGTCTTAA 7700
GTTTTCAATC ATTTTTTCCA TTGCTTCTTC CCAGCAGTAT GCCTTAACAG 7750
ATTGGATGTT CTCGATCATT TCTGAGGTAA TCACAAGTCT TTCACTGATC 7800
TTCCCAGCTC TCTGATCTCT GTACTTCATC ATCATTCTCC CTAGCCCAGC 7850
CTGAAAAAGG GCAAGGACTA TCAGGAAACC AAGTCCACAG AAGGCAGACG 7900
CCTGTAACAA CTCCCAGATT AGCCCCATGA GGAGTGCCAC TTGCAAAGGA 7950
GCGATCCACA CGAAATGTGC CAATGCAAGT CCTTCATCAA ATTTGTTCAG 8000
GTTGTTGGAA AGGAGACTAA CAAGTTGTCC AATACTTATT TTATCTAGAA 8050
CACGGCTTGA CAGCTTTAAA GTCTTCTTAT AAATCAAACT AAACATAGCT 8100
ATTCTCATCT GCATTCCAAT GTGATGAAGG CCAAAAATGG CTGGGTGTAG 8150
GAGCAGTGTC CTCACAATAA AGAGAAGGCA TAAGCCTATG CCTAGATAAA 8200
TCGCGATAGA GCGTTCCTCC TTGTTATCCG GGTCATAGGA AGCTATGATT 8250
CTTCCCAGTA AGAGAGGCTG TACTGCTTTG GTGACTTCCC CTAAATATAA 8300
AAAGATTCCA TAGAACATAA ATCTCCAGAA AAAACATCGC CGAAGGGCAT 8350
TAATGAGTTT AGGATTTTTC TTTGAAGCCA GCTCTCTATC CCATTCTCTT 8400
TCCAATTTTT CAGATAGATT GTCAGCAGAA TCAACAGAAG GGATTTGGTA 8450
TATGTCTGAC AATTCCAGGC GCTGTCTGTA TCCTTTCCTC AAAATTGGTC 8500
TGGTCCAGCT GAAAAAAAGT TTGGAGACAA CGCTGGCCTT TTCCAGAGGC 8550
GACCTCTGCA TGGTCTCTCG GGCGCTGGGG TCCCTGCTAG GGCCGTCTGG 8600
GCTCAAGCTC CTAATGCCAA AGGAATTCCT GCAGCCCGGG GGATCCACTA 8650
GTTCTAGAGC GGCCGCCACC GCGGTGGCTG ATCCCGCTCC CGCCCGCCGC 8700
GCGCTTCGCT TTTTATAGGG CCGCCGCCGC CGCCGCCTCG CCATAAAAGG 8750
AAACTTTCGG AGCGCGCCGC TCTGATTGGC TGCCGCCGCA CCTCTCCGCC 8800
TCGCCCCGCC CCGCCCCTCG CCCCGCCCCG CCCCGCCTGG CGCGCGCCCC 8850
CCCCCCCCCC CCGCCCCCAT CGCTGCACAA AATAATTAAA AAATAAATAA 8900
ATACAAAATT GGGGGTGGGG AGGGGGGGGA GATGGGGAGA GTGAAGCAGA 8950
ACGTGGCCTC GAGTAGATGT ACTGCCAAGT AGGAAAGTCC CATAAGGTCA 9000
TGTACTGGGC ATAATGCCAG GCGGGCCATT TACCGTCATT GACGTCAATA 9050
GGGGGCGTAC TTGGCATATG ATACACTTGA TGTACTGCCA AGTGGGCAGT 9100
TTACCGTAAA TACTCCACCC ATTGACGTCA ATGGAAAGTC CCTATTGGCG 9150
TTACTATGGG AACATACGTC ATTATTGACG TCAATGGGCG GGGGTCGTTG 9200
GGCGGTCAGC CAGGCGGGCC ATTTACCGTA AGTTATGTAA CGACCTGCAG 9250

GCTGATCTCC CTAGACAAAT ATTACGCGCT ATGAGTAACA CAAAATTATT 9300 

CAGATTTCAC TTCCTCTTAT TCAGTTTTCC CGCGAAAATG GCCAAATCTT 9350 

ACTCGGTTAC GCCCAAATTT ACTACAACAT CCOCCTAAAA CCGCGCGAAA 9400 

ATTGTCACTT CCTGTGTACA CCGGCGCACA CCAAAAACGT CACTTTTGCC 9450 

ACATCCGTCG CTTACATGTG TTCCGCCACA CTTGCAACAT CACACTTCCG 9500 

CCACACTACT ACGTCACCCG CCCCGTTCCC ACGCCCCGCG CCACGTCACA 9550 

AACTCCACCC CCTCATTATC ATATTGGCTT CAATCCAAAA TAAGGTATAT 9600 

TATTGATGAT GCTAGCATGC GCAAATTTAA AGCGCTGATA TCGATCGCGC 9650 

GCAGATCTGT CATGATGATC ATTGCAATTG GATCCATATA TAGGGCCCGG 9700 

GTTATAATTA CCTCAGGTCG ACGTCCCATG GCCATTCGAA TTCGTAATCA 9750 

TGGTCATAGC TGTTTCCTGT GTGAAATTGT TATCCGCTCA CAATTCCACA 9800 

CAACATACGA GCCGGAAGCA TAAAGTGTAA AGCCTGGGGT GCCTAATGAG 9850 

TGAGCTAACT CACATTAATT GCGTTGCGCT CACTGCCCGC TTTCCAGTCG 9900 

GGAAACCTGT CGTGCCAGCT GCATTAATGA ATCGGCCAAC GCGCGGGGAG 9950 

AGGCGGTTTG CGTATTGGGC GC 9972 

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 
TAGTAAATTT GGGC 14 
(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
AGTAAGATTT GGCC 



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
AGTGAAATCT GAAT 



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
GAATAATTTT GTGT 



(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: unknown 
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 
CGTAATATTT GTCT 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
WANWTTTG 
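
SEQ ID NO: 9 is written with IUPAC nucleotide ambiguity codes, in which W stands for A or T and N for any base. The following is a minimal sketch, not part of the original listing, of how such a consensus might be expanded into a regular expression and located in a plain sequence; the function names `consensus_to_regex` and `find_consensus` are hypothetical, and only the codes that occur in SEQ ID NO: 9 are handled.

```python
import re

# IUPAC ambiguity codes needed for SEQ ID NO: 9 (assumption: only these
# six symbols are supported; W = A or T, N = any base).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "[AT]", "N": "[ACGT]"}


def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into a regex pattern string."""
    return "".join(IUPAC[base] for base in consensus.upper())


def find_consensus(consensus, sequence):
    """Return 0-based start positions of all matches, overlapping included."""
    # Wrapping the pattern in a lookahead reports overlapping occurrences.
    lookahead = re.compile("(?=" + consensus_to_regex(consensus) + ")")
    return [m.start() for m in lookahead.finditer(sequence.upper())]


if __name__ == "__main__":
    # SEQ ID NO: 4 as given in this listing.
    print(find_consensus("WANWTTTG", "TAGTAAATTTGGGC"))
```

Run against SEQ ID NO: 4, the sketch reports a single match beginning at the fourth base (index 3, counting from zero), corresponding to the substring TAAATTTG.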



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19307 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

CCAATTCCAT CATCAATAAT ATACCTTATT TTGGATTGAA GCCAATATGA 50 

TAATGAGGGG GTGGAGTTTG TGACGTGGCG CGGGGCGTGG GAACGGGGCG 100 

GGTGACGTAG GTTTTAGGGC GGAGTAACTT GTATGTGTTG GGAATTGTAG 150 

TTTTCTTAAA ATGGGAAGTT ACGTAACGTG GGAAAACGGA AGTGACGATT 200 

TGAGGAAGTT GTGGGTTTTT TGGCTTTCGT TTCTGGGCGT AGGTTCGCGT 250 

GCGGTTTTCT GGGTGTTTTT TGTGGACTTT AACCGTTACG TCATTTTTTA 300 

GTCCTATATA TACTCGCTCT GCACTTGGCC CTTTTTTACA CTGTGACTGA 350 

TTGAGCTGGT GCCGTGTCGA GTGGTGTTTT TTTAATAGGT TTTCTTTTTT 400 
ACTGGTAAGG CTGACTGTTA GGCTGCCGCT GTGAAGCGCT GTATGTTGTT 450
CTGGAGCGGG AGGGTGCTAT TTTGCCTAGG CAGGAGGGTT TTTCAGGTGT 500
TTATGTGTTT TTCTCTCCTA TTAATTTTGT TATACCTCCT ATGGGGGCTG 550
TAATGTTGTC TCTACGCCTG CGGGTATGTA TTCCCCCCAA GCTTGCATGC 600
CTGCAGGTCG ACTCTAGAGG ATCCGAAAAA ACCTCCCACA CCTCCCCCTG 650
AACCTGAAAC ATAAAATGAA TGCAATTGTT GTTGTTAACT TGTTTATTGC 700
AGCTTATAAT GGTTACAAAT AAAGGAATAG CATCACAAAT TTCACAAATA 750
AAGCATTTTT TTCACTGCAT TCTAGTTGTG GTTTGTCCAA ACTCATCAAT 800
GTATCTTATC ATGTCTGGAT CCCCGCGGCC GCTCTAGAAC TAGTGGATCC 850
CCCGGGCTGC AGGAATTCCG TAACATAACT GCGTGCTTTA TTGAGATACA 900
CAGTAAAGCA GTAATATAAT ACAATAGTAA GGCATATATT TGGTGAAATC 950
TGATATGTTG TGAAAATGCA GTAAAACTGA AGTTTAAAAA AATAATTAGT 1000
AAATGTTACA GTGTTGGTGT TAAAACACAA TCTATTATGA TACTCAAGTA 1050
AGAGTCCAGT ACCTGGAGAC AATGATGATA CATGCCATGT GATGATTATG 1100
CTTCAGTTAC ACTGATTATG ATTTACACTT TAATACTTGA TGGTTATAAA 1150
GAACATGAAA TGATGTCCAA ATTATGCTTA AAATCAGCAA TAAAGCTCTC 1200
AGTTTTTATT CAAATATTTT GATAGATTCA CTCCAGAACT AATATCTAAA 1250
AGATAAAACG AAAAGATTAA AACAAAACTA TGCACTCTAT CTACCTTGGA 1300
TTTTAGAATG AAACTTAAAA CTTCTTAGTA GGAAAGGAAC CCCTTGTTTT 1350
AAATCTTGGT GAAAACAAAT CCTTGGATAA AGAAAATGCC CAGTGCCACA 1400
TAAAGGAGAG AGAGAGAGAA AAGCAAGACC AGAACCAAAT TTCAATTTGT 1450
TATCTTAGAG CTTTGGGTTT TCTTTTGGAA ATTATAAATG AAAAAAGGAA 1500
ACTGGTGTCC ACACAACAGA CAAGTGGTGA AGTTGTGAAA TTAGGTGTGC 1550
ACAATTACTA GAAACACCCC AAAACCAAAG TGAGGTAGAA ATAGCATGAG 1600
AAGCTGTGTT TGATGTTAAT TACAATTAAT AATGGACAAA ACCCACTCGC 1650
TAGAAGTTAA TTACACTTGA CGTTAGAGGT AAGAGATTTG CAAAATGATA 1700
GGACAGTGAT TTCTATTGAG AGAATGCTCT TTAAATGCTA AGAAGAAGAA 1750
ACTGGCATGA GAGGAGTAAA GCTCTTCCTA GCAGTCCTTA GCTTTCTGTT 1800
GCACTTTTTC TCCTGGTTCA ATGACTTGCA TTTGTTTAGA CATTTCAGCC 1850
CGTCAACTAG ACCAGAGAGT TTGGAGACGC Tl rTGCTCTC AAAACTTTCC 1900
AACCACTGTG CCTTCTCACC CACAATCCTG TGTGGAGTTA CTTGCAGGGA 1950
AACCAATGCA AAGGAGACAA ATGCAGTTCA TGGGCTTCTG GACTGATATT 2000
CACGAGGGTC ACAATGTGAT TGGGTTACTT TCTTAACAGT AATCCTAAGT 2050
CTTGCAGCAT TAAAAAAAAA AATCATCACA ATGAAGAAAA AAAAACCCAA 2100
AAAATCTAAA ATCTAAAATT CATCATCATC ATCAACAACA ACAACAACAA 2150
CAACAACAAA ACCACCCACT TCAGGTTGAG TTTATGAAGA GGGGAGAACA 2200
ATTTAGTTGT AATTATAGAG ATGTTTATAT GTATAGTTGT AAATATTCAT 2250
CCATTCTTTT ACAGAGTTGT TGCTCCCCTC ATATAAATTG ACTGAGGAGC 2300
CGCAACCTTT AGCTCCTACC ATCTTCCTCC TACTGTCTGG GAGTTAAAAA 2350
TGTCATCTGA TGTTCTATTG CAGAAACATC ATTAAATATA ACCCAACAGT 2400
AGGAAGTTGA ATATATGAGC CAACAAATTA CTATGATAGT AAGTCCTGTG 2450
TATTCATTCG CATGTTCCTT GAAAAAAATG AATCCTCTAG CTCTCAGTGG 2500
AAAGTTTAAA ACTAGAAACA TCTGGAGCCC TAGAGAATAT TTTAGTGTGG 2550
CGGTAGTCTC CTGGCTTTGG GCTCCAGGGA AAATTCACTC TTGCCCAAGC 2600
AGATAAGCCC AGATGACTAG AAGCAATTTC CATTAGGAAG TGGCAAGAAC 2650
ATTTGAAGAA GTAACTTCAT ATCTATTTAT CTATATACCT AlAolAl 1 XA 2700
TATACTTGTA GACATATAGA TGTATAAAAT GAAAGCCCAT AGCCAGCCCC 2750
ACTCAGTCAA CAATTCTCAA AAGAGCAATA TGAAGCAGTC ATTTGGTGGG 2800
GTTCGTATGC AAGAAAATAA AAAAACGTCA TGAATTCCAT ATGAATACCA 2850
CGCTAAAGTA ATGCAAAACA ATGTGCTGCC TCAGTGTGTG TGTGTGTGTG 2900
TGTGTGTGTG GTGGGTTCGT GCATGTATGT GTGCGTGTGT GTGTGTGTGT 2950
GTGTGTGTGT GTGTGTGTGC GTGTGTGTTT GTTTAGGGGT TTTTATAAAC 3000
AACTTTTTTT ATAAAGCACA CTTTAGTTTA
TATAAATTTT TAAACAACCC AAAATGCGTT 
TTATTTAGCT ATCAAGATTT TACATGTTTT 
TTGCATAGAC GTGTAAAACC TGCCATTGTT 
AGAAACTACT GAAATCTACA GTATAGTACC 
AGATTTTATT TCTTGTAAAC TCTTACTGTC 
ATATTATAAA AACCATGCGG GAATCAGGAG 
TCCTTCTTCA TCTGTCATGA CTGAAACTAA 
AATCATCTGC CATGTGGAAA AGGCTTCCTA 
GCTTTCCGGG GGCATTTCTT CCTCTTGAAC 
TGCTCCATCA CTTCTTCTAA CCCTGTGCTT 
AAGATCTTCC TCACCCATAG ATTCTGAAGT 
GCAGCATAGG CTGACTGCTA TCTGACCTCT 
GACACCGTGG TGCCATTCAC CTTAGCTTCA 
CTGTCTCAGT CTATGTAACT GAGACTCCAG 
GGATTTGCAT CCTGGCTTCC AGGCGTCCTT 
GCCTCAGCAA TGAGCTCAGC ATCCCTGGGA 
CATCTCAGGA GGAGATGGCA GTGGAGACAG 
GCTTCAGGCG ATCATATTCT GCTTGCAGAT 
TCTGCTAGGA TTCTCTCTAG CTCCCCTCTT 
CAAGATCTGG GCAGGACTAC GAGGCTGGCT 
AACTTTGGCA GTAATGCTGG ATTAACAAAT 
TTAGGAGAGA TGCTATCATT TAGATAAGAT 
TGCTAGCCTG CTAGCATAAT GTTCAATGCG 
AAAGCTGGGG GGACGAGGCA GGCGCAGAAT 
AGAGTAACGG GAGTTTCCAT GTTGTCCCCC 



CAATCTCTCT 


TTATAACTGT 


3050 


CCATATAAAG 


AAATGGCAAG 


3100 


CTTTTAACTT 


TTTTGTACAA 


3150 


AACAAAACAA 


TAACAGACTT 


3200 


ACTACCCTTC 


ACAAAAATAT 


3250 


TAATCCTCTT 


TGTTGTACGA 


3300 


TTGTAAAACA 


TTTATTCTGC 


3350 


GGACTCCATC 


GCTCTGCCCA 


3400 


CATTGTGTCC 


TCTCTCATTG 


3450 


TAGGGAAGGA 


GTTGTTGAGT 


3500 


X X V» A V^**%* 


GAGGACTCAG 


3550 


TTGACTGCCA 


ACCACTCGGA 


3600 


GCAGAGAGGT 


GGAAGGAGAG 


3650 


GCC*T*GGGGC?F 

Www X WW W A 


GCTCCAGGAG 


3700 


WXWX X XAX X\3 


TGGTCTTCCA


3750 


XWXw X X WWW 


CAGTAGCTTA 


3800 


CTCTGAGGAG 


AGGTGGGCAT


3850 


GCCTTTATGC 


TCATGCTGCT


3900 


TCCTGTTTTC 


TTCCTCAAGA 


3950 


TCCTCACTCT 


CTAAGGAAAT 




CAGGGGGGAG 


TCCTGGTTCA 


4050 


GTTCATCATC 


TATGCTCTCA 


4100 


CCATTGCTGT TTTCCATTTC 


4150 


TGAATGAGTA TCATCGTGTG 


4200 


CTACTGGCCA 


GAAGTTGATC 


4250 


TCTAACACAG 


TCTGCACTGG 


4300 
CAGGTAGCCC ATTCGGGGAT GCTTCGCAAA
TGTTTTTTAG TACCTTGGCG AAGTCGCGAA 
GGAGTGCAAT ACTCTACCAT GGGGTAGTGC 
TCGGCCAGAA AAAAAGCAAC TTTGGCAGAT 
GGCTTCTGTA CCTGAATCCA ATGATTGGAC 
TTGGCTTGAT GCTTGGCAGT TTCAGCAGCA 
CAGCCACACC ATAGACTGGG GTTCCAGGCG 
CAGCTTCAAT CTCAGGTTTA TTATTGGCAA 
CTCGGCTCAA TGTTACTGCC CCCAAAGGAA 
TGGGATTTGA ATAGAATCAT GCAGAAGAAG 
AAAAGCCAGT TGAACTTGCC ACTTGCTTGA 
TCCAAGTGTG CTTTACACAG AGAAATGATG 
ACGGATCCTC CCTGTTCGTC CCGTATCATA 
GACACATATC CAGACAGAGA GGGACATTGA 
TCCAGACGAT CATAAATTGT AGTCAAACAG 
CATGGGCTGG TCATTTTGCT TGAGGTTGTG 
CAGCTGACAG GCTCAAGAGA TCCAAGCAAA 
AGCTTCATGG CAGTCCTATA CGCGGAGAAC 
TAAAGACTGG TAGAGCTCTG TCATTTTGGG 
GGGTCTCGTG GTTGATATAG TAGGGCACTT 
TCCCAGGGAC CCTGAACTGA AGTGGAAAGG 
AAAGTCCCTG TGGGCTTCAT GCAGCTGTCT 
CCTGTAGAAG CCTCCATCTG GTATTCAGAT 
TAAGGTGAGA GCTGAATGCC CAGTGTGGTC 
GACACGATTG ACATTCTCTT TAAGAGGTGC 
TGACTTTTTC AAGGTGATCT TGCAGAGAGT 



ATACCTTTTG 


GTTCGAAATT 


4350 


CATCTTCTCC 


GGATGTAGTC 


4400 


ATTTTATGGC 


CCTTTGCAAC 


4450 


GTuATAATTA 


AAATGCTTTA 


4500 


ACTCCTTACA 


GATGTTACAC 


4550 


GCCACTCTGT 


GCAAGACGGG 


4600 


CATCCAGTCA 


AGGAAGAGAG 


4650 


ATTGGAAGGA 


GCTCCTGACA 


4700 


GCAACTTCAC 


CCAACTGTCT 


4750 


ACCCAGCCTA 


CGCTGGTCAC 


4800 


AAAGGTATCT 


GTACTTGTCT 


4850 


PC A CTTTT AA 


AAGACAGGAC 


4900 


A AC ATTG AGA 


AGCCAGTTGA 


4950 


CCAGATTGTT 


GTGCTCTTGC 


5000 


TTAATTATCT 


GCAGGATATC 


5050 


fTGGTCCAGG 


GCATCACATG 


5100 




GAGCCPTCTG 


5150 


CFG A CATT AT 


TCAGGTCAGC 


5200 


GTGGTCCCAA 


CAAGTGGTTT 


5250 


TGTTTGGTGA 


GATGGCTCTC 


5300 


AAGTGCTGGG 


ATGCAGGACC 


5350 


GACACGGTCC 


TCCACAGCCA 


5400 


CTTCCAAAGT 


GCTGAGGTTA 


5450 


AGCTGATGTG 


CAAGGTCATT 


5500 


AATTTCTCCC 


CGAAGTGCCT 


5550 


CAATGAGGAG 


ATCCCCCACT 


5600 
GGCTGCCAGG ATCCCTTGAT CACCTCAGCT
TTCATCGGCA GCTTCCTGAA GTTCCTGGAG
TTTTTCTCTG CCAATCAGCT GAGCGCAGGT 
TTGACCTCTT CAGCCTGCTT TCGTAGGAGC 
TTCTTCAGGA GGCAGTTCTC TGGGCTCCTG 
CCAAAGGCTG CTCTGTCAGA AATATTCTCA 
ATTACAGGTT CTTTAGTTTT CAATTCCCTC 
ATTCTGCTTC TGAACTGCTG GGAAATCACC 
TCAGTTCATC ATCTTTCAGC TGTAGCCAAA 
AGATGCAAAC GCTTCCACTG GTCAGAACTT 
GTTGAGAGAC TTTTTCTGAA GTTCACTCCA 
AACGTCTTTG TAACAGGGGT GCTTCATCCG 
ATTTTTTGGC CATTTTCATC AAGATTGTGA 
AATTTCTCCT TGGAGATCTT GCCATGGTTT 
TGGAGTCTTC TAGGAGCTTC TCCTTACGGG 
GCAGTTGTTT CTGCTTCCGT AATCCAGGAA 
AGGGAACTGC TGCAGTAATC TATGAGTTTC 
CACTTACTCT TTTATGAATG TTTCCCCAAG 
ATCATGTGTA CTTTTCTGGT ATCATCAGCA 
CAGTGCCAAA TCATTTGCCA CGTCTACACT 
CTTTGGCCAA CTGCTTGGTT TCTGTGATCT 
GTGTGAGGAC CTTCTTTCCA TGAGTCAAGC 
GACCTGTTCG GCTTCTTCCT TAGCTTCCAG 
ACATTTCATT CAACTGTTGT CTCCTGTTCT 
TCCCACTGAA TCTGAATTCT TTCAATTCGA 
TTCTTGATTG CTGGTTTTGT TTTTCAAATT 



TGGCGCAACT 


TGAGGTCCAG 


5650 


TCTTTCAAGA 


GCTTCATCTA 


5700 


TCAATTTGTC 


CCATTCAGCG 


5750 


CGAGTGACAT 


TCTGAGCTCT 


5800 


GTAGAGTTTC 


TCTAGTCCTT 


5850 


CAGTCTCCAG 


AGTACTCATG 


5900 


TTGAAGGCCC 


TATGTATATC 


5950 


ACCGATGGGT 


GCCTGACGGC 


6000 


CAAGAAGTTC 


CTGAAGAGAA 


6050 


GCTTCCAAAT 


GGGACCTAAT 


6100 


CTTGAAATTC 


ATGTTATCCA 


6150 


AACCTTCCAG 


GGATCTCAGG 


6200 


TAGATATCTG 


TGTGAGTTTC 


6250 


CATP AGCFCT 
wa x w x \* x 


CTGACTCCCC 


6300 


Anuwu X V-»W X w 


TAGGACATTG 


6350 


AUAAA W 


CCAGGTCCAG 


6400 


X X w wlnnvivA 


GCCTCTTGCT 


6450 


nnul AX luni 


ATTCTCTGTT 

^&X X X %*X X A 


6500 


GAATAGTCCC 


GAAGAAGTTT 


6550 


TATCTGCCGT 


TGACGGAGGT 


6600 


TCTTTTGGAT 


TGCATCTACT 


6650 


TTGCCTCTGA 


CCTGTCCTAT 


6700 


CCATTGTGTT 


GAATCCTTTA 


6750 


GCAGCTGTTC 


TTGAACCTCA 


6800 


TCAGTAATGA 


TTGTTCTAGC 


6850 


CTGGGCAGCA 


GTAATGAGTT 


6900 
CTTCCAATTG GGGGCGTCTC TGTTCCAAAT CTTGCAGTGT TGCCTTCTGT 6950
TTGATGATCA TTTCATTGAT GTCTTCCAGA TCACCCACCA TCACTCTCTG 7000
TGATTTTATA ACTCGATCAA GGAGAGACAG CCAGTCTGTA AGTTCTGTCC 7050
AAGCTCGGTT GAAGTCTGCC AGTGCAGGTA CCJCCAACAG CAAAGAAGAT 7100
GGCATTTCTA GTTTGGAGAT GACAGTTTCC TTAGTAACCA CAGATTGTGT 7150
CACTAGAGTA ACAGTCTGAC TGGCAGAGGC TCGAGTAGTG CTCAGTCCAG 7200
GGGCACGGTC AGGCTGCTTT GTCCTCAGCT CCCGAAGTAA ATGGTTTACA 7250
GCCTCCCACT CAGACCTCAG ATCTTCTAAC TTCCTCTTCA CTGGCTGAGT 7300
GCTTGGTTTT TCCTTATACA AATGCTGCCC TTTCGACAAA AGCCTTTCCA 7350
CATCCGCTTG TTTACCGTGA ACTGTTACTT CAATCTCCTT TATGTCAAAC 7400
GGTCCTGCCT GACTTGGTTG GTTATAAATT TCCAACTGGT TTCTAATAGG 7450
AGAGACCCAC AGAAGCAGGT GATCCAGCTG CTCTTCAAGC TGCCTAAAAT 7500
CTTTTAAGTG AACCTCAAGC TCTCCTTGTT TCTCAGGTAA AGCTCTGGAG 7550
ACCTTTATCC ACTGGAGATT TGTCTGTTTG AGCTTCTTTT CAAGTTTATC 7600
TTGCTCTTCT GGCCTTATGG GAGCACTTAC AAGTACTGCT CCTCCTGTTT 7650
CATTTAATTG TTTTAGAATT CCCTGGCGCA GGGGCAACTC TTCTGCCAGT 7700
AACTTGACTT GTTCAAGTTG TTCTTTTAGC TGCTGCTCAT CTCCAAGTGG 7750
AGTAATAGCA ATGTTATCTG CTTCTTCCAG CCACAAAACA AATTCATTTA 7800
AATCTCTTTG AAATTCTGAC AAGACATTCT TTTGTTCTTC AATCCTCTTT 7850
CTCCTTTCTG CCAGCTCTTT GCAGATGTCG TGCCACCGCA GACTCAAGCT 7900
TCCTAATTTT TCTTGTAGAA TATTGACATC TGTTTTTGAA GACTGTTGAA 7950
TTATTTCTTC CCCAGTTGCA TTCAGTGTTC TGACAACAGC TTGACGCTGC 8000
CCAATGCCAT CCTGGAGTTC CTTAAGATAC CATTTGTATT TAGCATGTTC 8050
CCAGTTTTCA GGATTTTGTG TCTTTTTGAA AAACTGTTCA ACTTCATTCA 8100
GCCATTGATT AAATACCTTC ATATCATAAT GAAAGTGTCG CCATTTTTCA 8150
ACTGATCTGT CGAATCGCCC TTGTCGTTCC TTGTACATTC TATGAAGTTT 8200
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TTCCCCCTGG AAATCCATCT GTGCCACGGC 
CCATGGAGGT GGCACTTTGC AAGGCTGCTG
TCAATCCGAC CTGAGATTTG TTGCAAATTG 
CTCCTCTTGC TTAAAAAGAT CTTCAAAATC 
TATTTAGAAG ATGATCAACT TCTGAAAGAG 
TCGGTCAAAT AAGTAGAAGG CACATAAGAA 
AGTCGTCACT ACCATAGTTT CTTCATGGAG 
TGAGTCTTCG AAACTGAGCA AAATTGCTCT 
CTGAGCTGGA TCTGAGTTGG CTCCACTGCC 
CAAGCCCTCA GCTTGCCTGC GCACTGCATT 
GCAATTCACG ATCAATTTCC TTTAATTTTC 
AGGCTGGCTA ATTTTTTTTC AATTTCATCC 
AGCCTGCCTC TTGTACTGAT ACCACTGGTG 
TTCTTCTTTG AGACCTCAAA TCCTTGAGAG 
AGCTGCTGTT TTATCTTTAT TTCCTCTCGC 
TTGTTGTAAG TTGTCTCCTC TTTGCAACAA 
TGTCTTCACT CATATCTTTA TTGAAGTCTT 
TGCTGAATTT CAGCCTCCAG TGGTTCAAGC 
AAACTGCTCC AATTCCTTCA AAGGAATGGA 
TGTGAGAAAT AGCTGCAAAT CGACGGTTGA 
ACTACTTTCC TGCAGTGGTC ACCGCGGTTT 
GTCACGTGTG GAGTCGACCT TTGGGCGCAT 
AACGCTTAAG AATGTCTTCC TTTTGTTGTG 
TCTAAAAGTT CATCTGCATG AATGATCCAC 
CTGATCAAAG GTTTCCATGT GTTTCTGGTA 
ATTCTTCTAC TCTGGAGGTG ACAGCTATCC 



TTCCTGTACT 


TTCACCTTTT 


8250 


TCTTCTTCTT 


GTGAATAATA 


8300 


TCTTTTATAT 


TCTTAAGAGA 


8350 


T.TAGCACAG AGTTCAGGAG 


8400 


CTTGTAAGAT 


ATGACTGATC 


8450 


ACATCCAAAG 


GCATATCTTC 


8500 


AGTGTGAATT 


TGTGCAAAGT 


8550 


CAATTTGCCG 


CCAGCGCTTG 


8600 


ATTGCGGCCC 


CATTCTCAGA 


8650 


CAGCTCCTCT 


TTCTTCTTCT 


8700 


TTTCATCTCT 


GGGTTCAGGT 


8750 


AAGCATTTCA 


GGAGATCATC 


8800 


AGAAATTTCT 


AGGGCCTTTT 


8850 


CATTATGTTT 


TGTCTGTAAC 


8900 


TTTCTCTCAT 


CTGTGATTCT 


8950 


TTCATTTACA 


GTACCCTCAT 


9000 


CCTCTTTCAG 


ATTCACCCCC 


9050 


AATTTTTGTA 


TATCTGAGTT 


9100 


GGCCTTTCCA 


GTCTTAATTC 


9150 


GCTCAGAGAT 


TTGGGGCTCT 




qCCATCAATT TTGCTGCTTG 


9250 


GTCATTCATT 


TCAGCCTTTA 


9300 


GTTTCTTCTT 


TTCAGACTCA 


9350 


TTTGTGATTT 


GTTCTATGTT 


9400 


TTCCAACAAA AGATTTAGCC 


9450 


AGTTACTGTT 


CAGAAGACTC 


9500 
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AGTTTATCTT CTACCAAGGT TTCTTTCTTG 
CTCTCCTAAT TCTGTAACAC TCTTCAAGTG 
CTTTTTGAGT AGCCTTTCCC CAGGCAACTT 
ATTCCTTCAA CTGCTGATCT CTTCGTCAAT 
CCATTCTGTT AAGACATTCA TTTCCTTTCT 
AGCATTTCTC CAACTGTTGC TTTCTCTCTG 
TTGTAATGCA ATTTCAAAGC TGTTACTCGT 
TTCTGTCTGC TTTTTCTGTA CAATTTGACG 
CCACTTCAGA CTTGACTTCA CTCAGGCTTT 
CTTAGTTGTG ACTGAATTAC TTCCTGTTCA 
AGGCAAATGC ATCTTGACTT CATCTAAAAT 
GTTGTTCAAA ATTGGCTGGT TTTTGGAATA 
TCTTGTAATT TTTTCTGTGC AACATCAATT 
GGCATCCTTC CCCTGGTTAT GTTTCTTCAT 
GACTTGTCAA ATCTGATTGG ATTTTCTGGG 
GCATCCACCT TGTCAGTGAT ATAAGCTGCC 
AAGCGACTCC TGAATTAAGT GCAAGGACTT 
GGATACTCTG TTCAAGCAAC TTTTGTTTCC 
TCCCTCCAAC GAGAATTAAA CGTCTCAAGC 
CATGACTCCT CCATCTGTAA GAGTCTGTGC 
GGTTCTCCTC TGAATGATGC ATCAGATTTT 
GTGATTTCCT CAGGTCCTGC AGGAACATTT 
TTCTACTTCA TTGAGCCACT TGTTTGCTTT 
CATGCCAACA TGCCCAAACT TCTTCCAAAG 
CTGGTGCACA GCCATTGGTA GTTGGTGGTC 
TAAGGCCTCT TGTGCTGAGG GTGGAGCGTG 



PPPAAPACCA 

WWAAVAwwA 


TTTTCAAAGA 



9550 


AWw W X iwlMl 


TTCTCAATCT 


9600 


CAGAATCCAA


ATTACTTGGC 


9650 


TCi ^TATCTG 

X W A W X«» A WX \* 


TTGCTGCCAG 


9700 


CATCTTACGG 



GACAACTTCA 


9750 


TTACCTTCGC 


ACCCAACTCA 


9800 


TCATCAAGCT 


CTTTGGGATT 


9850 


TPPOnWTTA 

X WWW X X X X A 


ATCACCATTT


9900 


TAT A P A AGTT 

Inl AWAAU X X 


CACACAATGA 


9950 


& O & t '"PP'P'P^i^ 


TTTCCAATGC 

XXX WWAA X WW 


10000 


^ & TPTT & / •> ilfll 
WaIvI X 


X X V* X AwAW 


10050 


AX UUAAA1 X X 


PATnfZACAPA 

VA X UUAUaWV 


10100 

x u x w 


OV'TV* A A A <~ A A 
X \» IwAAAwAA 


WwX lluul X 


10150 


TTCTTCTAAA 


V»X XAXwX VAX 




CTTCU lAvAlra 


WAX X iuAuvl 


10250


AAv X w w i xul 


PAATGAATTC 

V*AAX WAAX X W 


10300 


1 lUUtl ilWt 


X UvUvnvAV X 


10350 


mr»2i p Afspprp 

X UAUiuC w X V-» 


X X wAXwXAWX 


10400 


TCCTCATTGA 


TCAGTTCATC 


10450 


CAATAGACGA 


ATCTGATTTG 


10500 


CAAGAGATTC 


TAGCACTTCA 


10550 


TCCATGGTTT 


TAAGTTTCAA 


10600 


CTCTAAATAT 


GACAATAACT 


10650 


TTTTGCATTT 


TCCATTCAGC 


10700 


AGAGTTTCAA GTTCCTTTTT 


10750 


AGCTATTACA 


CTATTTACAG 


10800 
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TCTCA6TAAG GAGTTTCACT TTAGTTTCTT 
GCTCTCTTCA TTTCTTCAAC AGCAGTCTGT 
TTCAAAATCT CTCTCTAGAT ATTCTTCTTC 
GCATCTCTGA TAGATCTTTT TGGAGGCTTA 
TTTAAGGCTT CCTTTCTGGT GTAGACCTGG 
AGTGTTAAGC TCTCTAAGTT CTGTCTCCAG 
CAGCTTCACT CTTTATCTTC TGCCCACCTT 
GGCTGAATTG TTTGAATATC ACCAACTAAA 
TTTTTTCAGG ATTTCAGCAT CCCCCAGGGC 
AAACATCAAC TTCAGCCATC CATTTCTGTA 
AATTTTCGAA GTTTATTCAT ATGTTCTTCT 
CAACTGGGAG GAAAGTTTCT TCCAGTGCCC 
ACAGATATTT CTGGCATATT TCTGAAGGTG 
ACAGTGTCAC TCAGATAGTT GAAGCCATTT 
TTGCAGAGCC TGTAATTTCC CGAGTCTCTC 
TAACACTAAG ATAAGGTACA GAGAGTTTGC 
GTCCTGATGC TACTCATTGT CTCCTGATAG 
AAAAATTGTC TGTAGCTCTT TCTCTTTGGC 
GGTTAAAATG ATTAGTAAAG GCCACAAAGT 
CCCTGTCCCT TTTCTTTCAG TTGTAGACTC 
TTGAGGCTGA AGAGCTGACA ATCTGTTGAC 
ACTGGCTTTT AATTGCTGTT GGCTCTGATA 
AACAAGTTTT CGGCAGTAGT TGTCATCTGT 
ATAAAAGGTA ATGATGTTGG TTTGATACTC 
TCAGCAATTG GCAGAATTCT GTCCACCGGC 
TGTCTGATAC TTTCAGCATT AACACCCTCA 






X X XvXawxww 


PTOTTCTTTA 
w x wx x wx x in 


10850 


AAl iUAiWIVJ 


CAcinTATA 

WA w X X X X *» X *» 


10900 


Au^> X X\» lv X w 


ATPPACTCAT 

AX WWAW A WAX 


10950 


wVWVX X X 4A1V# 


CAAACCTGCC 


11000 


CGGCATATGT 


GATCCCACTG 


11050 


TCTGGATGCA 


AACTCAAGTT 


11100 


WAX i/UlWlv X 


ATTTAAACTG 


11150 


Aw x x w wa X X 


GTTTGAGCTG 



11200 


AGGCCATTCC 


TCTTTCAGGA


11250 


AWX XXX XAX 


GTGATTCTGA 


11300 


Awwi. X llbut 


Aw w XXX vwtv 


11350 




X vAAAX X V*lo 


11400


CTTTCTTGGC 




1 1 ARM 

111 9w 


TGTTGCTCTT 


1 UlAA bAA v X 




CTC CA J. X A 1 1 


X VAX AX X wAw 




TTTCTGACTG 


CxatvA X LtAt 


1 1 Ann 


CGCATTGGTG 


wlAAAwXvX V 




CCTCACACCA 


X vAAAwAxu X 




CTGCAX CwU» 


AAA VAX 1WIV# 


11750 


TGAATTTTTA 


ATTGCTCAAT 


11800 


TTCATCCTTA 


CAAATTTTTA 


11850 


GGGTGGTAGA 


CTGGGTTTTC 


11900 


TCCAATTGTT 


GTAGCTGATT 


11950 


TAGCCAGTTA 


ACTCTCTCAC 


12000 


TGTTCAGTTG 


TTCTGAAGCT 


12050 


TTTGCCATCT 


GTTCCACCAG 


12100 
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92 



GGCCTGAGCT


GATCTGCTGC 


CATCTTGCAG 


TTTTCTGAAC 


TTCTCTGCTT 


12150 


TTTCTCGTGC 


TATGGCATTG 


ACTTTTTCTT 


GCAAGTCTGA 


GATGTTGCCT 


12200 


TCTTTTCGAT 


AGACTGCAAA 


TTCAGAACTC 


TGTAATACAG 


CTTCTGAACG 


12250 


AGTAATCCAA 


CTGTGAAGTT 


CAGTTATATC 


GAOATCCAAC 


CTTTTCCTGA 


12300 


GTTCAGAATC 


CACAGTTATC 


TGCCTCTTCT 


TTTGAGGAGG 


TGGTGGTGGA 


12350 


AGTTCCTCTT 


GGGCATGTTT


TACCATGATT 


TGTTCCCTTG 


TGGTCACCAT 


12400 


AGTTACCGTT 


TCCATTACAG 


TTGTCTGTGT 


TAGGGATGGT 


TGAGTGGTGG 


12450 


TGACAGCCTG 


TGAAATTTGT 


GCTGAACTCT 


TTTCAAGTTT 


TTGGGTTAAA 


12500 


TTGTCCCAAC 


GTTGTGCAAA 


GTTTTCCATC 


CAGATTTCCA 


TCTTTTGAGT 


12550 


WiuiuAUl x*» 


X X x A A w*% w x *J 


CCGAAAGTAG 


ATCTTGATTG 


AGTGAACTTA 


12600 


/ *imiH|M|M|l^^^^l 

Ijlllil tvAl 


(^( , ^ , F , Tt , id^! , l v PT 

ww X XWWwX X X 


TTCTTTTCTA 


GATCTATTTT 


TAAAGTAGAT 


12650 


AX X X XwXw.fl/* 


GACTTGACAT 



CATTTCATTT 


TGATCTTTAA 


AGCCACTTGT 


12700 


W X UAA 1V71 1 v» 


TTCATTGCAT 



CTTCTTTTTC 


TGAAAGCCAT 


GTACTAAAAA 


12750 


uxjvaUI o x x w 


TTCAGTAAAA 



TGCTGCCATT 


TTAGAAGAAT 


ATCTTGTAAA 


12800 


awAa X V»WVw\- 


U\9 X W X X w*»V# X 


CCATCTGCAG 


ATATTTGCCC 


ATCGATCTCC 


12850 


CAGTACCTTA 


AGTTGTTCTT 


CCAAAGCAGC 


TGTTGCATGA 


TCACCGCTGG 


12900 


ATTCATCAAC 


CACTACTACC 


ATGTGAGTGA 


GCGAGTTGAC 


CCTGACCTGC 


12950 


TCCTGTTCTA 


GATCTTCTTG 


AAGCACCTTA 


TGTTGTTGTA 


CTTGGCATTT 


13000 


TAGATCTTCA 


AGATCAGGTC 


CAAAGGGCTC 


TTCCTCCATT 


TTCTTAGTTC 


13050 


TCTCTTCAGT 


TTTTGTTAAC 


CAGTCATCTA 


GTTCTTTTAA 


TTTCTGATTC 


13100 


TGGAGATCCA 


TTAGAACTTT 


GTGTAATTTG 


CTTTGTTTTT 


CCATGCTAGC 


13150 


TACCCTGAGA 


CATTCCCATC 


TTGAATTTAG 


GAGATTCATT 


TGTTCTTGCA 


13200 


CTTCAGCTTC 


TTCATCTTCT 


GATAATTTCC 


CTTTTCCAAC 


TAGTTGACTT 


13250 


CCTAACTGTA 


GAACATTACC 


AACAAGTCCT 


TGATGAGATG 


TCAGATCCAT 


13300 


CATGAATCCC 


TCATGAGCAT 


GAAACTGTTC 


TTTCACTTCT 


TCAACATCAT 


13350 


TTGAAATCTC 


TCCTTGTGCT CGCAATGTAT CCTCGGCAGA AAGAAGCCAT 


13400 




GAAAGTACTT CTTCTAAAGC AGTTTGGTAA 
CTCCATCAAT GAACTGTCAA GTGACTTGTC 
GTGAAGGATA GGGGCTCTGT GTGGAATCAG 
TGTGTGAAGG CATAACTCTT GAATCGAGGC
TTCATAGCCC TGTGCTAGAC TGACTGTGAT 
GGTGATGTAA TTGAAAATGT TCTTCTCTAG
GGCAACATTT CCACTTCTTG AATGGCTTCA 
AACTTGAAAG AGTGATGTGA TGTACATTAA 
AAGTGGTAGC AACATCTTCA GGATCAAGAA 
CATTTTGCAA TGTTGAAGGC ATGTTCCAGT 
TGAAACCACA CTATTCCAAT CAAACAGGTC 
GAGCATTCAA AGCCAACCCG TCGGACCAGC 
TTAACCTGTG GATAATTACG TGTTGACTGT 
CTTTTCACTG TTGGTTTGCT GCAATCCAGC 
TTTTGACCTG CCAGTGGAGG ATTATATTCC 
TTATGATTTC CATCCACTAT GTCAGTGCTT 
ATTATTTTTC TGTAAGACCC GCAGTGCCTT 
GAACTCTTGT AGATCCCTTT TCTTTTGGCA 
TCCAAGAGGT CTAGGAGGCG TTTTCCATCC 
GTCTATGTGT TGCTTTCCAA ACTTAGAAAA 
TGAATGTTTT CTTTTGAACA TCTTCTCTTT 
TCCCACCAAA GCATTTGGAA GAAAAAGTAT 
ATCTTGGTAA AAGTTTCTCC CAGTTTTATT 
GATGAGAAGC CAATAAACTT CAGCAGCCTT 
TAGCACTTCA AGTCTTCCTA TTCGTTTTTT 
AGAGCGGAAT TCCTGCAGCC CGGGGGATCC



CTATCCAGAT 


TTACTTCCGT 


13450 


TCTGGGAGCT 


TCGAAATGCT 


13500 


AGGTGGCAAC 


ATAAGCAGCC 


13550 


TAAGGAGATG


AAGAAGTTTG 


13600 


CTGTTGAGAG 


TAATGCATCT 


13650 


TTACTTTTGA 


AGATGTCCTG 


13700 


ATGCTCACTT 


GTTGTGGCAA 

A> A> A> ^^A A« a 


13750 


GATGGACTTC 
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GGGTACAATT CCGCAGCTTT TAGAGCAGAA GTAACACTTC CGTACAGGCC 14750 

TAGAAGTAAA GGCAACATCC ACTGAGGAGC AGTTCTTTGA TTTGCACCAC 14800 

CACCGGATCC GGGACCTGAA ATAAAAGACA AAAAGACTAA ACTTACCAGT 14850 

TAACTTTCTG GTTTTTCAGT TCCTCGAGTA CCGGATCCTC TAGAGTCCGG 14900

AGGCTGGATC GGTCCCGGTG TCTTCTATGG AGGTCAAAAC AGCGTGGATG 14950 

GCGTCTCCAG GCGATCTGAC GGTTCACTAA ACGAGCTCTG CTTATATAGA 15000 

CCTCCCACCG TACACGCCTA CCGCCCATTT GCGTCAATGG GGCGGAGTTG 15050 

TTACGACATT TTGGAAAGTC CCGTTGATTT TGGTGCCAAA ACAAACTCCC 15100 

ATTGACGTCA ATGGGGTGGA GACTTGGAAA TCCCCGTGAG TCAAACCGCT 15150 

ATCCACGCCC ATTGATGTAC TGCCAAAACC GCATCACCAT GGTAATAGCG 15200 

ATGACTAATA CGTAGATGTA CTGCCAAGTA GGAAAGTCCC ATAAGGTCAT 15250 

GTACTGGGCA TAATGCCAGG CGGGCCATTT ACCGTCATTG ACGTCAATAG 15300 

GGGGCGTACT TGGCATATGA TACACTTGAT GTACTGCCAA GTGGGCAGTT 15350 

TACCGTAAAT ACTCCACCCA TTGACGTCAA TGGAAAGTCC CTATTGGCGT 15400 

TACTATGGGA ACATACGTCA TTATTGACGT CAATGGGCGG GGGTCGTTGG 15450 

GCGGTCAGCC AGGCGGGCCA TTTACCGTAA GTTATGTAAC GACCTGCAGG 15500 

TCGACTCTAG AGGATCTCCC TAGACAAATA TTACGCGCTA TGAGTAACAC 15550 

AAAATTATTC AGATTTCACT TCCTCTTATT CAGTTTTCCC GCGAAAATGG 15600 

CCAAATCTTA CTCGGTTACG CCCAAATTTA CTACAACATC CGCCTAAAAC 15650 

CGCGCGAAAA TTGTCACTTC CTGTGTACAC CGGCGCACAC CAAAAACGTC 15700 

ACTTTTGCCA CATCCGTCGC TTACATGTGT TCCGCCACAC TTGCAACATC 15750 

ACACTTCCGC CACACTACTA CGTCACCCGC CCCGTTCCCA CGCCCCGCGC 15800 

CACGTCACAA ACTCCACCCC CTCATTATCA TATTGGCTTC AATCCAAAAT 15850 

AAGGTATATT ATTGATGATG CTAGCGGGGC CCTATATATG GATCCAATTG 15900 

CAATGATCAT CATGACAGAT CTGCGCGCGA TCGATATCAG CGCTTTAAAT 15950 

TTGCGCATGC TAGCTATAGT TCTAGAGGTA CCGGTTGTTA ACGTTAGCCG 16000




GCTACGTATA CTCCGGAATA TTAATAGGCC 
GCCGCCTGCA GCTGGCGCCA TCGATACGCG
GTACAGAGCT CGAGAAGTAC TAGTGGCCAC 
TTGGCACTGG CCGTCGTTTT ACAACGTCGT 
TACCCAACTT AATCGCCTTG CAGCACATCC 
ATAGCGAAGA GGCCCGCACC GATCGCCCTT 
AATGGCGAAT GGCGCCTGAT GCGGTATTTT 
TATTTCACAC CGCATACGTC AAAGCAACCA 
CGCATTAAGC GCGGCGGGTG TGGTGGTTAC 
TTGCCAGCGC CCTAGCGCCC GCTCCTTTCG 
GCCACGTTCG CCGGCTTTCC CCGTCAAGCT 
AGGGTTCCGA TTTAGTGCTT TACGGCACCT 
TGGGTGATGG TTCACGTAGT GGGCCATCGC
CCTTTGACGT TGGAGTCCAC GTTCTTTAAT 
TGGAACAACA CTCAACCCTA TCTCGGGCTA 
TTTTGCCGAT TTCGGCCTAT TGGTTAAAAA 
TTTAACGCGA ATTTTAACAA AATATTAACG 
TCTCAGTACA ATCTGCTCTG ATGCCGCATA 
CGCCAACACC CGCTGACGCG CCCTGACGGG 
GCTTACAGAC AAGCTGTGAC CGTCTCCGGG 
TTCACCGTCA TCACCGAAAC GCGCGAGACG 
TATTTTTATA GGTTAATGTC ATGATAATAA 
GGCACTTTTC GGGGAAATGT GCGCGGAACC 
AATACATTCA AATATGTATC CGCTCATGAG 
TTCAATAATA TTGAAAAAGG AAGAGTATGA 
GCCCTTATTC CCTTTTTTGC GGCATTTTGC 
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AGAAACGCTG GTGAAAGTAA AAGATGCTGA AGATCAGTTG GGTGCACGAG 17350 

TGGGTTACAT CGAACTGGAT CTCAACAGCG GTAAGATCCT TGAGAGTTTT 17400 

CGCCCCGAAG AACGTTTTCC AATGATGAGC ACTTTTAAAG TTCTGCTATG 17450 

TGGCGCGGTA TTATCCCGTA TTGACGCCGG GCAAGAGCAA CTCGGTCGCC 17500 

GCATACACTA TTCTCAGAAT GACTTGGTTG AGTACTCACC AGTCACAGAA 17550 

AAGCATCTTA CGGATGGCAT GACAGTAAGA GAATTATGCA GTGCTGCCAT 17600 

AACCATGAGT GATAACACTG CGGCCAACTT ACTTCTGACA ACGATCGGAG 17650 

GACCGAAGGA GCTAACCGCT TTTTTGCACA ACATGGGGGA TCATGTAACT 17700 

CGCCTTGATC GTTGGGAACC GGAGCTGAAT GAAGCCATAC CAAACGACGA 17750 

GCGTGACACC ACGATGCCTG TAGCAATGGC AACAACGTTG CGCAAACTAT 17800 

TAACTGGCGA ACTACTTACT CTAGCTTCCC GGCAACAATT AATAGACTGG 17850 

ATGGAGGCGG ATAAAGTTGC AGGACCACTT CTGCGCTCGG CCCTTCCGGC 17900 

TGGCTGGTTT ATTGCTGATA AATCTGGAGC CGGTGAGCGT GGGTCTCGCG 17950 

GTATCATTGC AGCACTGGGG CCAGATGGTA AGCCCTCCCG TATCGTAGTT 18000 

ATCTACACGA CGGGGAGTCA GGCAACTATG GATGAACGAA ATAGACAGAT 18050 

CGCTGAGATA GGTGCCTCAC TGATTAAGCA TTGGTAACTG TCAGACCAAG 18100 

TTTACTCATA TATACTTTAG ATTGATTTAA AACTTCATTT TTAATTTAAA 18150 

AGGATCTAGG TGAAGATCCT TTTTGATAAT CTCATGACCA AAATCCCTTA 18200 

ACGTGAGTTT TCGTTCCACT GAGCGTCAGA CCCCGTAGAA AAGATCAAAG 18250 

GATCTTCTTG AGATCCTTTT TTTCTGCGCG TAATCTGCTG CTTGCAAACA 18300 

AAAAAACCAC CGCTACCAGC GGTGGTTTGT TTGCCGGATC AAGAGCTACC 18350 

AACTCTTTTT CCGAAGGTAA CTGGCTTCAG CAGAGCGCAG ATACCAAATA 18400 

CTGTTCTTCT AGTGTAGCCG TAGTTAGGCC ACCACTTCAA GAACTCTGTA 18450 

GCACCGCCTA CATACCTCGC TCTGCTAATC CTGTTACCAG TGGCTGCTGC 18500 

CAGTGGCGAT AAGTCGTGTC TTACCGGGTT GGACTCAAGA CGATAGTTAC 18550 

CGGATAAGGC GCAGCGGTCG GGCTGAACGG GGGGTTCGTG CACACAGCCC 18600 
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AGCTTGGAGC GAACGACCTA GACCGAACTG 
ATGAGAAAGC GCCACGCTTC CCGAAGGGAG 
TAAGCGGCAG GGTCGGAACA GGAGAGCGCA 
AACGCCTGGT ATCTTTATAG TCCTGTCGGG 
GCGTCGATTT TTGTGATGCT CGTCAGGGGG 
CCAGCAACGC GGCCTTTTTA CGGTTCCTGG 
CACATGTTCT TTCCTGCGTT ATCCCCTGAT 
CGCCTTTGAG TGAGCTGATA CCGCTCGCCG 
GCGAGTCAGT GAGCGAGGAA GCGGAAGAGC 
CTCCCCGCGC GTTGGCCGAT TCATTAATGC 
CGACTGGAAA GCGGGCAGTG AGCGCAACGC 
CTCATTAGGC ACCCCAGGCT TTACACTTTA 
TGTGGAATTG TGAGCGGATA ACAATTTCAC 
TGATTACGAA TTCGAATGGC CATGGGACGT 
CCGGGCC 
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WHAT IS CLAIMED IS: 

1. A recombinant shuttle vector comprising: 

(a) the DNA sequences of, or corresponding to, 
a portion of the genome of an adenovirus which comprises
DNA sequences of, or corresponding to, the adenovirus 5'
and 3' inverted terminal repeats and packaging/enhancer
domain necessary for replication and virion encapsidation 
in the absence of sequence encoding viral genes; 

(b) a selected gene operatively linked to 
regulatory sequences directing its expression, said gene 
operatively linked to the DNA of (a) and capable of 
expression in a target cell in vivo or in vitro. 

2. The vector according to claim 1 wherein said 
DNA sequences (a) comprise the native adenovirus 5' 
inverted terminal repeats and packaging sequences. 

3. The vector according to claim 1 wherein said
DNA sequences (a) comprise the native adenovirus 3'
inverted terminal repeat sequences. 

4. The vector according to claim 1 wherein said 
selected gene (b) is a reporter gene. 

5. The vector according to claim 4 wherein said 
reporter gene is selected from the group consisting of 
the genes encoding β-galactosidase, alkaline phosphatase
and green fluorescent protein. 

6. The vector according to claim 1 wherein said 
selected gene (b) is a therapeutic gene. 
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7. The vector according to claim 6 wherein said 
therapeutic gene is selected from the group consisting of 
a normal CFTR gene, a DMD Becker allele and a normal LDL 
gene. 

8. A crippled adenovirus helper virus comprising a 
modified adenovirus sequence in place of native 
adenovirus sequence map units 0-1, which modification 
reduces the packaging efficiency of said virus, said 
virus also containing selected adenovirus genes necessary 
to direct a productive viral infection. 

9. The helper virus according to claim 8 wherein 
said modified sequence comprises: 

i. a fragment of adenovirus map units 0-1; 

ii. a fragment of (i) containing a 5' inverted
terminal repeat and between one to four selected 
packaging sequences, 

iii. a modified fragment of (i) containing at 
least one PAC consensus sequence in place of at least one 
native PAC sequence; and 

iv. a modified fragment of (ii), wherein said
native PAC sequences are mutated to contain modified 
sequences. 

10. The virus according to claim 8 wherein said 
modified sequence comprises Ad5 base pairs 1-269. 

11. The virus according to claim 8 wherein said 
sequence (ii) comprises Ad5 base pairs 1-321. 

12. The virus according to claim 8 wherein said 
helper adenovirus is conjugated to a poly-cation 
sequence.




13. A method for producing a recombinant adenovirus 
which comprises transfecting a selected host cell with

(a) a recombinant shuttle vector comprising 

i. the DNA sequences of, or
corresponding to, a portion of the genome of an
adenovirus which comprises adenovirus 5' and 3'
cis-elements necessary for replication and virion
encapsidation in the absence of sequence encoding viral
genes; and

ii. a selected gene operatively linked to 
regulatory sequences directing its expression, said gene 
linked to the DNA of (a) and capable of expression in a 
target cell in vivo or in vitro; and 

(b) a helper adenovirus comprising sufficient 
adenovirus gene sequences necessary for a productive 
viral infection, wherein said transfected host cell
permits the formation of a recombinant virus comprising 
the DNA of (i) and (ii) in an adenoviral capsid, and 

isolating and purifying the recombinant virus from 
said cell. 

14. The method according to claim 13, wherein said 
helper virus is a crippled helper virus comprising a 
modified adenovirus sequence in place of native 
adenovirus sequence map units 0-1, which modification 
reduces the packaging efficiency of said helper virus, 
said helper virus also containing selected adenovirus 
genes necessary to direct a productive viral infection. 

15. The method according to claim 13 wherein said 
helper adenovirus is associated with a poly-cation 
sequence. 
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16. The method according to claim 13 wherein said 
vector is associated with said helper adenovirus 
conjugate in a single particle. 

17. The method according to claim 13 wherein said 
helper virus is an adenovirus sequence containing 
deletions of all or portions of the E1a and E1b genes.

18. The method according to claim 13 wherein said 
helper virus is an adenovirus sequence containing 
deletions of all or a portion of the E3 gene. 

19. A recombinant adenovirus comprising 

i. the DNA of, or corresponding to, a 
portion of the genome of an adenovirus which comprises 
adenovirus 5' and 3' cis-elements necessary for
replication and virion encapsidation in the absence of 
sequence encoding viral genes; 

ii. a selected gene operatively linked to 
regulatory sequences directing its expression, said gene 
linked to the DNA of (a) and capable of expression in a 
target cell in vivo or in vitro; 

said DNA and gene encapsidated in an adenoviral 
capsid.

20. The virus according to claim 19 wherein said 
viral capsid is a capsid of an adenovirus serotype
selected from the group consisting of types 2, 4, 5, 7, 
12 and 40. 

21. The virus according to claim 19 wherein said 
selected gene is a CFTR gene, a DMD gene and an LDL gene. 
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22. The use of a recombinant adenovirus according 
to claim 19 for the manufacture of a pharmaceutical 
composition suitable for delivering and integrating a 
selected gene into the chromosome of a target cell. 
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FIGURE 3A 
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FIGURE 3B 
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CCACATATCC 


3600 


TGATCTTCCA 


GATAACTGCC 


GTCACTCCAA 


CGCAGCACCA 


TCACCGCGAG 


3650 


GCGGTTTTCT 


CCGGCGCGTA 


AAAATGCGCT 


CAGGTCAAAT 


TCAGACGGCA 


3700 


AACGACTGTC 


CTGGCCGTAA 


CCGACCCAGC 


GCCCGTTGCA 


CCACAGATGA 


3750 


AACGCCGAGT 


TAACGCCATC 


AAAAATAATT 


CGCGTCTGGC 


CTTCCTGTAG 


3800 


CCAGCTTTCA 


TCAACATTAA 


ATGTGAGCGA 


GTAACAACCC 


GTCGGATTCT 


3850 


CCGTGGGAAC 


AAACGGCGGA 


TTGACCGTAA 


TGGGATAGGT 


TACGTTGGTG 


3900 


TAGATGGGCG 


CATCGTAACC 


GTGCATCTGC 


CAGTTTGAGG 


GGACGACGAC 


3950 
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AGTATCGGCC 


TCAGGAAGAT 


CGCACTCCAG 


CCAGCTTTCC 


GGCACCGCTT 


4000 


CTGGTGCCGG 


AAACCAGGCA 


AAGCGCCATT 


CGCCATTCAG 


GCTGCGCAAC 


4050 


TGTTGGGAAG 


GGCGATCGGT 


GCGGGCCTCT 


TCGCTATTAC


GCCAGCTGGC 


4100 


CAAAGGGGGA 


TGTGCTGCAA 


GGCGATTAAG 


TTGGGTAACG 


CCAGGGTTTT 


4150 


CCCAGTCACG 


ACGTTGTAAA 


ACGACGGGAT 


CGCGCTTGAG 


CAGCTCCTTG 


4200 


CTGGTGTCCA 


GACCAATGCC 


TCCCAGACCG 


GCAACGAAAA 


TCACGTTCTT 


4250 


GTTGGTCAAA 


GTAAACGACA 


TGGTGACTTC 


TTTTTTGCTT 


TAGCAGGCTC 


4300 


TTTCGATCCC 


CGGGAATTGC 


GGCCGCGGGT 


ACAATTCCGC 


AGCTTTTAGA 


4350 


GCAGAAGTAA 


CACTTCCGTA 


CAGGCCTAGA 


AGTAAAGGCA 


ACATCCACTG 


4400 


AGGAGCAGTT 


CTTTGATTTG 


CACCACCACC 


GGATCCGGGA 


CCTGAAATAA 


4450 


AAGACAAAAA 


GACTAAACTT 


ACCAGTTAAC 


TTTCTGGTTT 


TTCAGTTCCT 


4500 


CGAGTACCGG 


ATCCTCTAGA 


GTCCGGAGGC 


TGGATCGGTC 


CCGGTCTCTT 


4550 


CTATGGAGGT 


CAAAACAGCG 


TGGATGGCGT 


CTCCAGGCGA 


TCTGACGGTT 


4600 


CACTAAACGA 


GCTCTGCTTA 


TATAGACCTC 


CCACCGTACA 


CGCCTACCGC 


4650 


CCATTTGCGT 


CAATGGGGCG 


GAGTTGTTAC 


GACATTTTGG 


AAAGTCCCGT 


4700 


TGATTTTGGT 


GCCAAAACAA 


ACTCCCATTG 


ACGTCAATGG 


GGTGGAGACT 


4750 


TGGAAATCCC 


CGTGAGTCAA 


ACCGCTATCC 


ACGCCCATTG 


ATGTACTGCC 


4800 


AAAACCGCAT 


CACCATGGTA 


ATAGCGATGA 



CTAATACGTA 


GATGTACTGC 


4850 


CAAGTAGGAA 


AGTCCCATAA 


UUiwllVSAAv 


luuViWilnni 






CCATTTACCG 


TCATTGACGT 


CAATAGGGGG 


CGTACTTGGC 


ATATGATACA 


4950 


CTTGATGTAC 


TGCCAAGTGG 


GCAGTTTACC 


GTAAATACTC 


CACCCATTGA 


5000 


CGTCAATGGA 


AAGTCCCTAT 


TGGCGTTACT 


ATGGGAACAT 


ACGTCATTAT 


5050 


TGACGTCAAT 


GGGCGGGGGT 


CGTTGGGCGG 


TCAGCCAGGC 


GGGCCATTTA 


5100 


CCGTAAGTTA 


TGTAACGACC 


TGCAGGTCGA 


CTCTAGAGGA 


TCTCCCTAGA 


5150 


CAAATATTAC 


GCGCTATGAG 


TAACACAAAA 


TTATTCAGAT 


TTCACTTCCT 


5200 


CTTATTCAGT 


TTTCCCGCGA 


AAATGGCCAA 


ATCTTACTCG 


GTTACGCCCA 


5250 
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AATTTACTAC 


AACATCCGCC 


TAAAACCGCG 


CGAAAATTGT 


CACTTCCTGT 


5300 


GTACACCGGC 


GCACACCAAA 


AACGTCACTT 


TTGCCACATC 


CGTCGCTTAC 


5350 


ATGTGTTCCG 


CCACACTTGC 


AACATCACAC 


TTCCGCCACA 


CTACTACGTC 


5400 


ACCCGCCCCG 


TTCCCACGCC 


CCGCGCCACG 


TCACAAACTC 


CACCCCCTCA 


5450 


TTATCATATT 


GGCTTCAATC 


CAAAATAAGG 


TATATTATTG 


ATGATGCTAG 


5500 


CGAATTCATC 


GATGATATCA 


GATCTGCCGG 


TCTCCCTATA 


GTGAGTCGTA 


5550 


TTAATTTCGA 


TAAGCCAGGT 


TAACCTGCAT 


TAATGAATCG 


GCCAACGCGC 


5600 


GGGGAGAGGC 


GGTTTGCGTA 


TTGGGCGCTC 


TTCCGCTTCC 


TCGCTCACTG 


5650 


ACTCGCTGCG 


CTCGGTCGTT 


CGGCTGCGGC 


GAGCGGTATC 


AGCTCACTCA 


5700 


AAGGCGGTAA 


TACGGTTATC 


CACAGAATCA 


GGGGATAACG 


CAGGAAAGAA 


5750 


CATGTGAGCA 


AAAGGCCAGC 


AAAAGGCCAG 


GAACCGTAAA 


AAGGCCGCGT 


5800 


TGCTGGCGTT 


TTTCCATAGG 


CTCCGCCCCC 


CTGACGAGCA 


TCACAAAAAT 


5850 


CGACGCTCAA 


GTCAGAGGTG 


GCGAAACCCG 


ACAGGACTAT 


AAAGATACCA 


5900 


GGCGTTTCCC 


CCTGGAAGCT 


CCCTCGTGCG 


CTCTCCTGTT 


CCGACCCTGC 


5950 


CGCTTACCGG 


ATACCTGTCC 


GCCTTTCTCC 


CTTCGGGAAG 


CGTGGCGCTT 


6000 


TCTCAATGCT 


CACGCTGTAG 


GTATCTCAGT 


TCGGTGTAGG 


TCGTTCGCTC 


6050 


CAAGCTGGGC 


TGTGTGCACG 


AACCCCCCGT 


TCAGCCCGAC 


CGCTGCGCCT 


6100 


TATCCGGTAA 


CTATCGTCTT 


GAGTCCAACC 


CGGTAAGACA 


CGACTTATCG 


6150 


CCACTGGCAG 


CAGCCACTGG 


TAACAGGATT 


AGCAGAGCGA 


GGTATGTAGG 


6200 


CGGTGCTACA 


GAGTTCTTGA 


AGTGGTGGCC 


TAACTACGGC 


TACACTAGAA 


6250 




X WX AX \* X SJ\» 


GCTCTGCTGA


AGCCAGTTAC 


CTTCGGAAAA 


6300 


AGAGTTGGTA 


GCTCTTGATC 


CGGCAAACAA 


ACCACCGCTG 


CTAGCGGTGG 


6350 


TTTTTTTGTT 


TGCAAGCAGC 


AGATTACGCG 


CAGAAAAAAA 


GGATCTCAAG 


6400 


AAGATCCTTT 


GATCTTTTCT 


ACGGGGTCTG 


ACGCTCAGTG 


GAACGAAAAC 


6450 


TCACGTTAAG 


GGATTTTGGT 


CATGAGATTA 


TCAAAAAGGA 


TCTTCACCTA 


6500 


GATCCTTTTA 


AATTAAAAAT 


GAAGTTTTAA 


ATCAATCTAA 


AGTATATATG 


6550 
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AGTAAACTTG 


GTCTGACAGT 


TACCAATGCT 


TAATCAGTGA 


GGCACCTATC 


6600 


TCAGCGATCT 


GTCTATTTCG 


TTCATCCATA 


GTTGCCTGAC 


TCCCCGTCGT 


6650 


GTAGATAACT 


ACGATACGGG 


AGGGCTTACC 


ATCTGGCCCC 


AGTGCTGCAA 


6700 


TGATACCGCG 


AGACCCACGC 


TCACCGGCTC 


CAGATTTATC


AGCAATAAAC 


6750 


CAGCCAGCCG 


GAAGGGCCGA 


GCGCAGAAGT 


GGTCCTGCAA 


CTTTATCCGC 


6800 


CTCCATCCAG 


TCTATTAATT 


GTTGCCGGGA 


AGCTAGAGTA 


AGTAGTTCGC 


6850 


CAGTTAATAG 


TTTGCGCAAC 


GTTGTTGCCA 


TTGCTACAGG 


CATCGTGGTG 


6900 


TCACGCTCGT 


CGTTTGGTAT 


GGCTTCATTC 


AGCTCCGGTT 


CCCAACGATC 


6950 


AAGGCGAGTT 


ACATGATCCC 


CCATGTTGTG 


CAAAAAAGCG 


GTTAGCTCCT 


7000 


TCGGTCCTCC 


GATCGTTGTC 


AGAAGTAAGT 


TGGCCGCAGT 


GTTATCACTC 


7050 


ATGGTTATGG 


CAGCACTGCA 


TAATTCTCTT 


ACTGTCATGC 


CATCCGTAAG 


7100 


ATGCTTTTCT 


GTGACTGGTG 


AGTACTCAAC 


CAAGTCATTC 


TGAGAATAGT 


7150 


GTATGCGGCG 


ACCGAGTTGC 


TCTTGCCCGG 


CGTCAATACG 


GGATAATACC 


7200 


GCGCCACATA 


GCAGAACTTT 


AAAAGTGCTC 


ATCATTGGAA 


AACGTTCTTC 


7250 


GGGGCGAAAA 


CTCTCAAGGA 


TCTTACCGCT 


GTTGAGATCC 


AGTTCGATGT 


7300 


AACCCACTCG 


TGCACCCAAC 


TGATCTTCAG 


CATCTTTTAC 


TTTCACCAGC 


7350 


GTTTCTGGGT 


GAGCAAAAAC 


AGGAAGGCAA 


AATGCCGCAA 


AAAAGGGAAT 


7400 


AAGGGCGACA 


CGGAAATGTT 


GAATACTGAT 


ACTCTTCCTT 


TTTCAATATT 


7450 


ATTGAAGCAT 


TTATCAGGGT 


TATTGTCTCA 


TGAGCGGATA 


CATATTTGAA 


7500 


TGTATTTAGA 


AAAATAAACA 


AATAGGGGTT 


CCGCGCACAT 


TTCCCCGAAA 


7550 


AGTGCCACCT 


GACGTCTAAG 


AAACCATTAT 



TATCATGACA 


TTAACCTATA 



7600 


AAAATAGGCG 


TATCACGAGG 


CCCTTTCGTC 


TCGCGCGTTT 


CGGTGATGAC 


7650 


GGTGAAAACC 


TCTGACACAT 


GCAGCTCCCG 


GAGACGGTCA 


CAGCTTGTCT 


7700 


GTAAGCGGAT 


GCCGGGAGCA 


GACAAGCCCG 


TCAGGGCGCG 


TCAGCGGGTG 


7750 


TTGGCGGGTG 


TCGGGGCTGG 


CTTAACTATG 


CGGCATCAGA 


GCAGATTGTA 


7800 


CTGAGAGTGC 


ACCATATGGA 


CATATTGTCG 


TTAGAACGCG 


GCTACAATTA 


7850 


ATACATAACC 


TTATGTATCA 


TACACATACG 


ATTTAGGTGA 


CACTATA 


7897 
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FIGURE 5A 



GAATTCGCTA 


GCTAGCGGGG 


GAATACATAC 


CCGCAGGCGT 


AGAGACAACA 


50 


TTACAGCCCC 


CATAGGAGGT 


ATAACAAAAT 


TAATAGGAGA 


GAAAAACACA 


100 


TAAACACCTG 


AAAAACCCTC 


CTGCCTAGGC 


AAAATAGCAC 


CCTCCCGCTC 


150 


CAGAACAACA 


TACAGCGCTT 


CACAGCGGCA 


GCCTAACAGT 


CAGCCTTACC 


200 


AGTAAAAAAG 


AAAACCTATT 


AAAAAAACAC 


CACTCGACAC 


GGCACCAGCT 


250 


CAATCAGTCA 


CAGTGTAAAA 


AAGGGCCAAG 


TGCAGAGCGA 


GTATATATAG 


300 


GACTAAAAAA 


TGACGTAACG 


GTTAAAGTCC 


ACAAAAAACA 


CCCAGAAAAC 


350 


CGCACGCGAA 


CCTACGCCCA 


GAAACGAAAG 


CCAAAAAACC 


CACAACTTCC 


400 


TCAAATCGTC 


ACTTCCGTTT 


TCCCACGTTA 


CGTAACTTCC 


CATTTTAAGA 


450 


AAACTACAAT 


TCCCAACACA 


TACAAGTTAC 


TCCGCCCTAA 


AACCTACGTC 


500 


ACCCGCCCCG 


TTCCCACGCC 


CCGCGCCACG 


TCACAAACTC 


CACCCCCTCA 


550 


TTATCATATT 


GGCTTCAATC 


CAAAATAAGG 


TATATTATTG 


ATGATGCTAG 


600 


CATCATCAAT 


AATATACCTT 


ATTTTGGATT 


GAAGCCAATA 


TGATAATGAG 


650 


GGGGTGGAGT 


TTGTGACGTG 


GCGCGGGGCG 


TGGGAACGGG 


GCGGGTGACG 


700 


TAGTAGTGTG 


GCGGAAGTGT 


GATGTTGCAA 


GTGTGGCGGA 


ACACATGTAA 


750 


GCGACGGATG 


TGGCAAAAGT 


GACGTTTTTG 


GTGTGCGCCG 


GTGTACACAG 


800 


GAAGTGACAA 


TTTTCGCGCG 


GTTTTAGGCG 


GATGTTGTAG 


TAAATTTGGG 


850 


CGTAACCGAG 


TAAGATTTGG 


CCATTTTCGC 


GGGAAAACTG 


AATAAGAGGA 


900 


AGTGAAATCT 


GAATAATTTT 


GTGTTACTCA 


TAGCGCGTAA 


TATTTGTCTA 


950 


GGGAGATCAG 


CCTGCAGGTC 


GTTACATAAC 


TTACGGTAAA 


TGGCCCGCCT 


1000 


GGCTGACCGC 


CCAACGACCC 


CCGCCCATTG 


ACGTCAATAA 


TGACGTATGT 


1050 


TCCCATAGTA 


ACGCCAATAG 


GGACTTTCCA 


TTGACGTCAA 


TGGGTGGAGT 


1100 


ATTTACGGTA 


AACTGCCCAC 


TTGGCAGTAC 


ATCAAGTGTA 


TCATATGCCA 


1150 


AGTACGCCCC 


CTATTGACGT 


CAATGACGGT 


AAATGGCCCG 


CCTGGCATTA 


1200 


TGCCCAGTAC 


ATGACCTTAT 


GGGACTTTCC 


TACTTGGCAG 


TACATCTACG 


1250 


TATTAGTCAT 


CGCTATTACC 


ATGGTGATGC 


GGTTTTGGCA 


GTACATCAAT 


1300 
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GGGCGTGGAT 


AGCGGTTTGA 


CTCACGGGGA 


TTTCCAAGTC 


TCCACCCCAT 


1350 


TGACGTCAAT 


GGGAGTTTGT 


TTTGGCACCA 


AAATCAACGG 


GACTTTCCAA 


1400 


AATGTCGTAA 


CAACTCCGCC 


CCATTGACGC 


AAATGGGCGG 


TAGGCGTGTA 


1450 


CGGTGGGAGG


TCTATATAAG 


CAGAGCTCGT 


TTAGTGAACC 


GTCAGATCGC 


1500 


CTGGAGACGC 


CATCCACGCT 


GTTTTGACCT 


CCATAGAAGA 


CACCGGGACC 


1550 


GATCCAGCCT 


CCGGACTCTA 


GAGGATCCGG 


TACTCGAGGA 


ACTGAAAAAC 


1600 


CAGAAAGTTA 


ACTGGTAAGT 


TTAGTCTTTT 


TGTCTTTTAT 


TTCAGGTCCC 


1650 


GGATCCGGTG 


GTGGTGCAAA 


TCAAAGAACT 


GCTCCTCAGT 


GGATGTTGCC 


1700 


TTTACTTCTA 


GGCCTGTACG 


GAAGTGTTAC 


TTCTGCTCTA 


AAAGCTGCGG 


1750 


AATTGTACCC 


GCGGCCGCAA 


TTCCCGGGGA 


TCGAAAGAGC 


CTGCTAAAGC 


1800 


AAAAAAGAAG 


TCACCATGTC 


GTTTACTTTG 


ACCAACAAGA 


ACGTGATTTT 


1850 


CGTTGCCGGT 


CTGGGAGGCA 


TTGGTCTGGA 


CACCAGCAAG 


GAGCTGCTCA 


1900 


AGCGCGATCC 


CGTCGTTTTA 


CAACGTCGTG 


ACTGGGAAAA 


CCCTGGCGTT 


1950 


ACCCAACTTA 


ATCGCCTTGC 


AGCACATCCC 


CCTTTCGCCA 


GCTGGCGTAA 


2000 


TAGCGAAGAG 


GCCCGCACCG 


ATCGCCCTTC 


CCAACAGTTG 


CGCAGCCTGA 


2050 


ATGGCGAATG 


GCGCTTTGCC 


TGGTTTCCGG 


CACCAGAAGC 


GGTGCCGGAA 


2100 


AGCTGGCTGG 


AGTGCGATCT 


TCCTGAGGCC 


GATACTGTCG 


TCGTCCCCTC 


2150 


AAACTGGCAG 


ATGCACGGTT 


ACGATGCGCC 


CATCTACACC 


AACGTAACCT 


2200 


ATCCCATTAC 


GGTCAATCCG 


CCGTTTGTTC 


CCACGGAGAA 


TCCGACGGGT 


2250 


TGTTACTCGC 


TCACATTTAA 


TGTTGATGAA 


AGCTGGCTAC 


AGGAAGGCCA 


2300 




AX XXX X WAX v 






CTCTGGTGCA 


2350 


ACGGGCGCTG 


GGTCGGTTAC 


GGCCAGGACA 


GTCGTTTGCC 


GTCTGAATTT 


2400 


GACCTGAGCG 


CATTTTTACG 


CGCCGGAGAA 


AACCGCCTCG 


CGGTGATGGT 


2450 


GCTGCGTTGG 


AGTGACGGCA 


GTTATCTGGA 


AGATCAGGAT 


ATGTGGCGGA 


2500 


TGAGCGGCAT 


TTTCCGTGAC 


GTCTCGTTGC 


TGCATAAACC 


GACTACACAA 


2550 


ATCAGCGATT 


TCCATGTTGC 


CACTCGCTTT 


AATGATGATT 


TCAGCCGCGC 


2600 
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T6TACTGGAG 


GCTGAAGTTC 


AGATGTGCGG 


CGAGTTGCGT 


GACTACCTAC 


2650 


GGGTAACAGT 


TTCTTTATGG 


CAGGGTGAAA 


CGCAGGTCGC 


CAGCGGCACC 


2700 


GCGCCTTTCG 


GCGGTGAAAT 


TATCGATGAG 


CGTGGTGGTT 


ATGCCGATCG 


2750 


CGTCACACTA 


CGTCTGAACG 


TCGAAAACCC 


GAAACTGTGG 


AGCGCCGAAA 


2800 


TCCCGAATCT 


CTATCGTGCG 


GTGGTTGAAC 


TGCACACCGC 


CGACGGCACG 


2850 


CTGATTGAAG 


CAGAAGCCTG 


CGATGTCGGT 


TTCCGCGAGG 


TGCGGATTGA 


2900 


AAATGGTCTG 


CTGCTGCTGA 


ACGGCAAGCC 


GTTGCTGATT 


CGAGGCGTTA 


2950 


ACCGTCACGA 


GCATCATCCT 


CTGCATGGTC 


AGGTCATGGA 


TGAGCAGACC 


3000 


ATGGTGCAGG 


ATATCCTGCT 


GATGAAGCAG 


AACAACTTTA 


ACGCCGTGCG 


3050 


CTGTTCGCAT 


TATCCGAACC 


ATCCGCTGTG 


GTACACGCTG 


TGCGACCGCT 


3100 


ACGGCCTGTA 


TGTGGTGGAT 


GAAGCCAATA 


TTGAAACCCA 


CGGCATGGTG 


3150 


CCAATGAATC 


GTCTGACCGA 


TGATCCGCGC 


TGGCTACCGG 


CGATGAGCGA 


3200 


ACGCGTAACG 


CGAATGGTGC 


AGCGCGATCG 


TAATCACCCG 


AGTGTGATCA 


3250 


TCTGCTCGCT 


GGGGAATGAA 


TCAGGCCACG 


GCGCTAATCA 


CGACGCGCTG 


3300 


TATCGCTGGA 


TCAAATCTGT 


CGATCCTTCC 


CGCCCGGTGC 


AGTATGAAGG 


3350 


CGGCGGAGCC 


GACACCACGG 


CCACCGATAT 


TATTTGCCCG 


ATGTACGCGC 


3400 


GCGTGGATGA 


AGACCAGCCC 


TTCCCGGCTG 


TGCCGAAATG 


GTCCATCAAA 


3450 


AAATGGCTTT 


CGCTACCTGG 


AGAGACGCGC 


CCGCTGATCC 


TTTGCGAATA 


3500 


CGCCCACGCG 


ATGGGTAACA 


GTCTTGGCGG 


TTTCGCTAAA 


TACTGGCAGG 


3550 


CGTTTCGTCA 


GTATCCCCGT 


TTACAGGGCG 


GCTTCGTCTG 


GGACTGGGTG 


3600 


un X WAV? X UvjW 


TGATTAAATA


TGATGAAAAC


GGCAACCCGT 


GGTCGGCTTA 


3650 


CGGCGGTGAT 


TTTGGCGATA 


CGCCGAACGA 


TCGCCAGTTC 


TGTATGAACG 


3700 


GTCTGGTCTT 


TGCCGACCGC 


ACGCCGCATC 


CAGCGCTGAC 


GGAAGCAAAA


3750 


CACCAGCAGC 


AGTTTTTCCA 


GTTCCGTTTA 


TCCGGGCAAA 


CCATCGAAGT 


3800 


GACGAGCGAA 


TACCTGTTCC 


GTCATAGCGA 


TAACGAGCTC 


CTGCACTGGA 


3850 


TGGTGGCGCT 


GGATGGTAAG 


CCGCTGGCAA 


GCGGTGAAGT 


GCCTCTGGAT 


3900 
GTCGCTCCAC


AAGGTAAACA 


GTTGATTGAA 


CTGCCTGAAC 


TACCGCAGCC 


3950 


GGAGAGCGCC


GGGCAACTCT 


GGCTCACAGT 


ACGCGTAGTG


CAACCGAACG 


4000 


CGACCGCATG 


GTCAGAAGCC 


GGGCACATCA 


GCGCCTGGCA 


GCAGTGGCGT 


4050 


CTGGCGGAAA 


ACCTCAGTGT 


GACGCTCCCC 


GCGCGTCCC 


ACGCCATCCC 


4100 


GCATCTGACC 


ACCAGCGAAA 


TGGATTTTTG 


CATCGAGCTG 


GGTAATAAGC 


4150 


GTTGGCAATT 


TAACCGCCAG 


TCAGGCTTTC 


TTTCACAGAT 


GTGGATTGGC 


4200 


GATAAAAAAC 


AACTGCTGAC 


GCCGCTGCGC 


GATCAGTTCA 


CCCGTGCACC 


4250 


GCTGGATAAC 


GACATTGGCG 


TAAGTGAAGC 


GACCCGCATT 


GACCCTAACG 


4300 


CCTGGGTCGA 


ACGCTGGAAG 


GCGGCGGGCC 


ATTACCAGGC 


CGAAGCAGCG 


4350 


TTGTTGCAGT 


GCACGGCAGA 


TACACTTGCT 


GATGCGGTGC 


TGATTACGAC 


4400 


CGCTGACGCG 


TGGCAGCATC 


AGGGGAAAAC 


CTTATTTATC 


AGCCGGAAAA 


4450 


CCTACCGGAT 


TGATGGTAGT 


GGTCAAATGG 


CGATTACCGT 


TGATGTTGAA 


4500 


GTGGCGAGCG 


ATACACCGCA 


TCCGGCGCGG 


ATTGGCCTGA 


ACTGCCAGCT 


4550 


GGCGGAGGTA 


GCAGAGCGGG 


TAAACTGGCT 


CGGATTAGGG 


CCGCAAGAAA 


4600 


ACTATCCCGA 


CCGCCTTACT 


GCCGCCTGTT 


TTGACCGCTG 


GGATCTGCCA 


4650 


TTGTCAGACA 


TGTATACCCC 


GTACGTCTTC 


CCGAGCGAAA 


ACGGTCTGCG 


4700 


CTGCGGGACG 


CGCGAATTGA 


ATTATGGCCC 


ACACCAGTGG 


CGCGGCGACT 


4750 


TCCAGTTCAA 


CATCAGCCGC 


TACAGTCAAC 


AGCAACTGAT 


GGAAACCAGC 


4800 


CATCGCCATC 


TGCTGCACGC 


GGAAGAAGGC 


ACATGGCTGA 


ATATCGACGG 


4850 


TTTCCATATG 


GGGATTGGTG 


GCGACGACTC 


CTGGAGCCCG 


TCAGTATCGG 


4900 


CGGAATTACA 


GCTGAGCGCC 


GGTCGCTACC 


ATTACCAGTT 


GGTCTGGTGT 


4950 


CAAAAATAAT 


AATAACCGGG 


CAGGCCATGT 


CTGCCCGTAT 


TTCGCGTAAG 


5000 


GAAATCCATT 


ATGTACTATT 


TAAAAAACAC 


AAACTTTTGG 


ATGTTCGGTT 


5050 


TATTCTTTTT 


CTTTTACTTT 


TTTATCATGG 


GAGCCTACTT 


CCCGTTTTTC 


5100 


CCGATTTGGC 


TACATGACAT 


CAACCATATC 


AGCAAAAGTG 


ATACGGGTAT 


5150 


TATTTTTGCC 


GCTATTTCTC 


TGTTCTCGCT 


ATTATTCCAA 


CCGCTGTTTG 


5200 


GTCTGCTTTC 


TGACAAACTC 


GGCCTCGACT


CTAGGCGGCC 


GCGGGGATCC 


5250 
AGACATGATA


AGATACATTG 


ATGAGTTTGG 


ACAAACCACA 


ACTAGAATGC 


5300 


AGTGAAAAAA 


ATGCTTTATT 


TGTGAAATTT 


GTGATGCTAT 


TGCTTTATTT 


5350 


GTAACCATTA 


TAAGCTGCAA 


TAAACAAGTT 


AACAACAACA 


ATTGCATTCA 


5400 


TTTTATGTTT 


CAGGTTCAGG 


GGGAGGTGTG 


GGAGGTTTTT 


TCGGATCCTC 


5450 


TAGAGTCGAC 


GACGCGAGGC 


TGGATGGCCT 


TCCCCATTAT 


GATTCTTCTC 


5500 


GCTTCCGGCG 


GCATCGGGAT 


GCCCGCGTTG


CAGGCCATGC 


TGTCCAGGCA 


5550 


GGTAGATGAC 


GACCATCAGG 


GACAGCTTCA 


AGGATCGCTC 


GCGGCTCTTA 


5600 


CCAGCCTAAC 


TTCGATCACT 


GGACCGCTGA 


TCGTCACGGC 


GATTTATGCC 


5650 


GCCTCGGCGA 


GCACATGGAA 


CGGGTTGGCA 


TGGATTGTAG 


GCGCCGCCCT 


5700 


ATACCTTGTC 


TGCCTCCCCG 


CGTTGCGTCG 


CGGTGCATGG 


AGCCGGGCCA 


5750 


CCTCGACCTG 


AATGGAAGCC 


GGCGGCACCT 


CGCTAACGGA 


TTCACCACTC 


5800 


CAAGAATTGG 


AGCCAATCAA 


TTCTTGCGGA 


GAACTGTGAA 


TGCGCAAACC 


5850 


AACCCTTGGC 


AGAACATATC 


CATCGCGTCC 


GCCATCTCCA 


GCAGCCGCAC 


5900 


GCGGCGCATC 


TCGGGCAGCG 


TTGGGTCCTG 


GCCACGGGTG 


CGCATGATCG 


5950 


TGCTCCTGTC 


GTTGAGGACC 


CGGCTAGGCT 


GGCGGGGTTG 


CCTTACTGGT 


6000 


TAGCAGAATG 


AATCACCGAT 


ACGCGAGCGA 


ACGTGAAGCG 


ACTGCTGCTG 


6050 


CAAAACGTCT 


GCGACCTGAG 


CAACAACATG 


AATGGTCTTC 


GGTTTCCGTG 


6100 


TTTCGTAAAG 


TCTGGAAACG 


CGGAAGTCAG 


CGCCCTGCAC 


CATTATGTTC 


6150 


CGGATCTGCA 


TCGCAGGATG 


CTGCTGGCTA 


CCCTGTGGAA 


CACCTACATC 


6200 


TGTATTAACG 


AAGCCTTTCT 


CAATGCTCAC 


GCTGTAGGTA 


TCTCAGTTCG 


6250 


GTGTAGGTCG 


TTCGCTCCAA 


GCTGGGCTGT 


GTGCACGAAC 


CCCCCGTTCA 


6300 


GCCCGACCGC 


TGCGCCTTAT 


CCGGTAACTA 


TCGTCTTGAG 


TCCAACCCGG 


6350 


TAAGACACGA 


CTTATCGCCA 


CTGGCAGCAG 


CCACTGGTAA 


CAGGATTAGC 


6400 


AGAGCGAGGT 


ATGTAGGCGG 


TGCTACAGAG 


TTCTTGAAGT 


GGTGGCCTAA 


6450 


CTACGGCTAC 


ACTAGAAGGA 


CAGTATTTGG 


TATCTGCGCT 


CTGCTGAAGC 


6500 


CAGTTACCTT 


CGGAAAAAGA 


GTTGGTAGCT 


CTTGATCCGG 


CAAACAAACC 


6550 
ACCGCTGGTA


GCGGTGGTTT 


TTTTGTTTGC 


AAGCAGCAGA 


TTACGCGCAG 


6600 


AAAAAAAGGA 


TCTCAAGAAG 


ATCCTTTGAT 


CTTTTCTACG 


GGGTCTGACG 


6650 


CTCAGTGGAA 


CGAAAACTCA 


CGTTAAGGGA 


TTTTGGTCAT 


GAGATTATCA 


6700 


AAAAGGATCT 


TCACCTAGAT 


CCTTTTAAAT 


TAAAAATGAA


GTTTTAAATC 


6750 


AATCTAAAGT 


ATATATGAGT 


AAACTTGGTC 


TGACAGTTAC 


CAATGCTTAA 


6800 


TCAGTGAGGC 


ACCTATCTCA 


GCGATCTGTC 


TATTTCGTTC 


ATCCATAGTT 


6850 


GCCTGACTCC 


CCGTCGTGTA 


GATAACTACG 


ATACGGGAGG 


GCTTACCATC 


6900 


TGGCCCCAGT 


GCTGCAATGA 


TACCGCGAGA 


CCCACGCTCA 


CCGGCTCGAG 


6950 


ATTTATCAGC 


AATAAACCAG 


CCAGCCGGAA 


GGGCCGAGCG 


CAGAAGTGGT 


7000 


CCTGCAACTT 


TATCCGCCTC 


CATCCAGTCT 


ATTAATTGTT 


GCCGGGAAGC 


7050 


TAGAGTAAGT 


AGTTCGCCAG 


TTAATAGTTT 


GCGCAACGTT 


GTTGCCATTG 


7100 


CTGCAGGCAT 


CGTGGTGTCA 


CGCTCGTCGT 


TTGGTATGGC 


TTCATTCAGC 


7150 


TCCGGTTCCC 


AACGATCAAG 


GCGAGTTACA 


TCATCCCCCA 


TGTTGTGCAA 


7200 


AAAAGCGGTT 


AGCTCCTTCG 


GTCCTCCGAT 


CGTTGTCAGA 


AGTAAGTTGG 


7250 


CCGCAGTGTT 


ATCACTCATG 


GTTATGCCAG 


CACTGCATAA 


TTCTCTTACT 


7300 


GTCATGCCAT 


CCGTAAGATG 


CTTTTCTGTG 


ACTGGTGAGT 


ACTCAACCAA 


7350 


GTCATTCTGA 


GAATAGTGTA 


TGCGGCGACC 


GAGTTGCTCT 


TGCCCGGCGT 


7400 


CAACACGGGA 


TAATACCGCG 


CCACATAGCA 


CAACTTTAAA 


AGTGCTCATC 


7450 


ATTGGAAAAC 


GTTCTTCGGG 


GCGAAAACTC 


TCAAGGATCT 


TACCGCTGTT 


7500 


GAGATCCAGT 


TCGATGTAAC 


CCACTCGTGC 


ACCCAACTGA 


TCTTCAGCAT 


7550 


CTTTTACTTT


CACCAGCGTT


TCTGGGTGAG


CAAAAACAGG


AAGGCAAAAT


7600


GCCGCAAAAA 


AGGGAATAAG 


GGCGACACGG 


AAATGTTGAA 


TACTCATACT 


7650 


CTTCCTTTTT 


CAATATTATT 


GAAGCATTTA 


TCAGGGTTAT 


TGTCTCATGA 


7700 


GCGGATACAT 


ATTTGAATGT 


ATTTAGAAAA 


ATAAACAAAT 


AGGGGTTCCG 


7750 


CGCACATTTC 


CCCGAAAAGT 


GCCACCTGAC 


GTCTAAGAAA 


CCATTATTAT 


7800 


CATGACATTA 


ACCTATAAAA 


ATAGGCGTAT 


CACGAGGCCC 


TTTCGTCTTC 


7850 


AA 










7852 
TCTTCCGCTT


CCTCGCTCAC 


TGACTCGCTG 


CGCTCGGTCG 


TTCGGCTGCG 


50 


GCGAGCGGTA


TCAGCTCACT 


CAAAGGCGGT 


AATACGGTTA 


TCCACAGAAT 


100 


CAGGGGATAA 


CGCAGGAAAG 


AACATGTGAG 


CAAAAGGCCA 


GGAAAAGGCC 


150 


AGGAACCGTA 


AAAAGGCCGC 


GTTGCTGGCG 


TTTTTCCATA 


GGCTCCGCCC 


200 


CCCTGACGAG 


CATCACAAAA 


ATCGACGCTC 


AAGTCAGAGG 


TGGCGAAACC 


250 


CGACAGGACT 


ATAAAGATAC 


CAGGCGTTTC 


CCCCTGGAAG 


CTCCCTCGTG 


300 


CGCTCTCCTG 


TTCCGACCCT 


GCCGCTTACC 


GGATACCTGT 


CCGCCTTTCT 


350 


CCCTTCGGGA 


AGCGTGGCGC 


TTTCTCATAG 


CTCACGCTGT 


AGGTATCTCA 


400 


GTTCGGTGTA 


GGTCGTTCGC 


TCCAAGCTGG 


GCTGTGTGCA 


CGAACCCCCC 


450 


GTTCAGCCCG 


ACCGCTGCGC 


CTTATCCGGT 


AACTATCGTC 


TTGAGTCCAA 


500 


CCCGGTAAGA 


CACGACTTAT 


CGCCACTGGC 


AGCAGCCACT 


GGTAACAGGA 


550 


TTAGCAGAGC 


GAGGTATGTA 


GGCGGTGCTA 


CAGAGTTCTT 


GAAGTGGTGG 


600 


CCTAACTACG 


GCTACACTAG 


AAGAAGAGTA 


TTTGGTATCT 


GCGCTCTGCT 


650 


GAAGCCAGTT 


ACCTTCGGAA 


AAAGAGTTGG 


TAGCTCTTGA 


TCCGGCAAAC 


700 


AAACCACCGC 


TGGTAGCGGT 


GGTTTTTTTG 


TTTGCAAGCA 


GCAGATTACG 


750 


CGCAGAAAAA 


AAGGATCTCA 


AGAAGATCCT 


TTGATCTTTT 


CTACGGGGTC 


800 


TGACGCTCAG 


TGGAACGAAA 


ACTCACGTTA 


AGGGATTTTG 


GTCATGAGAT 


850 


TATCAAAAAG 


GATCTTCACC 


TAGATCCTTT 


TAAATTAAAA 


ATGAAGTTTT 


900 


AAATGAATCT 


AAAGTATATA 


TGAGTAAACT 


TGGTCTGACA 


GTTACCAATG 


950 


CTTAATCAGT 


GAGGCACCTA 


TCTCAGCGAT 


CTGTCTATTT 


CGTTCATCCA 


1000


TAGTTGCCTG


ACTCCCCGTC


GTGTAGATAA


CTACGATACG


GGAGGGCTTA 


1050 


CCATCTGGCC 


CCAGTGCTGC 


AATGATACCG 


CCAGACCCAC 


GCTCACCGGC 


1100 


TCCAGATTTA 


TCAGCAATAA 


ACCAGCCAGC 


CGGAAGGGCC 


GAGCGCAGAA 


1150 


GTGGTCCTGC 


AACTTTATCC 


GCCTCCATCC 


AGTCTATTAA 


TTGTTGCCGG 


1200 


GAAGCTAGAG 


TAAGTAGTTC 


GCCAGTTAAT 


AGTTTGCGCA 


ACGTTGTTGC 


1250 


CATTGCTACA 


GGCATCGTGG 


TGTCACGCTC 


GTCGTTTGGT 


ATGGCTTCAT 


1300 
TCAGCTCCGC


TTCCCAACGA 


TCAAGGCGAG 


TTACATGATC 


CCCCATGTTG 


1350 


TGCAAAAAAG 


CGGTTAGCTC 


CTTCGGTCCT 


CCGATCGTTG 


TCAGAAGTAA 


1400 


GTTGGCCGCA 


GTGTTATCAC 


TCATGGTTAT 


GGCAGCACTG 


CATAATTCTC 


1450 


TTACTGTCAT 


GCCATCCGTA 


AGATGCTTTT 


CTGTGACTGG 


TGAGTACTCA 


1500 


ACCAAGTCAT 


TCTGAGAATA 


GTGTATGCGG 


CGACCGAGTT 


GCTCTTGCCC 


1550 


GGCGTCAATA 


CGGGATAATA 


CCGCGCCACA 


TAGCAGAACT 


TTAAAAGTGC 


1600 


TCATCATTGG 


AAAACGTTCT 


TCGGGGCGAA 


AACTCTCAAG 


GATCTTACCG 


1650 


CTGTTGAGAT 


CCAGTTCGAT 


GTAACCCACT 


CGTGCACCCA 


ACTGATCTTC 


1700 


AGCATCTTTT 


ACTTTCACCA 


GCGTTTCTGG 


GTGAGCAAAA 


ACAGGAAGGC 


1750 


AAAATGCCGC 


AAAAAAGGGA 


ATAAGGGCGA 


CACGGAAATG 


TTGAATACTC 


1800 


ATACTCTTCC 


TTTTTCAATA 


TTATTGAAGC 


ATTTATCAGG 


GTTATTGTCT 


1850 


CATGAGCGGA 


TACATATTTG 


AATGTATTTA 


GAAAAATAAA 


CAAATAGGGG 


1900 


TTCCGCGCAC 


ATTTCCCCGA 


AAAGTGCCAC 


CTGACGTCTA 


AGAAACCATT 


1950 


ATTATCATGA 


CATTAACCTA 


TAAAAATAGG 


CGTATCACGA 


GGCCCTTTCG


2000 


TCTCGCGCGT 


TTCGGTGATG 


ACGGTGAAAA 


CCTCTGACAC 


ATGCAGCTCC 


2050 


CGGAGACGGT 


CACAGCTTGT 


CTGTAAGCGG 


ATGCCGGGAG 


CAGACAAGCC 


2100 


CGTCAGGGCG 


CGTCAGCGGG 


TGTTGGCGGG 


TGTCGGGGCT 


GGCTTAACTA 


2150 


TGCGGCATCA 


GAGCAGATTG 


TACTGAGAGT 


GCACCATAAA 


ATTGTAAACG 


2200 


TTAATATTTT 


GTTAAAATTC 


GCGTTAAATT 


TTTGTTAAAT 


CAGCTCATTT 


2250 


TTTAACCAAT 


AGGCCGAAAT 


CGGCAAAATC 


CCTTATAAAT 


CAAAAGAATA 


2300 


GCCCGAGATA 


GGGTTGAGTG 


TTGTTCCAGT 


TTGGAACAAG 


AGTCCACTAT 


2350 


TAAAGAACGT 


GGACTCCAAC 


GTCAAAGGGC 


GAAAAACCGT 


CTATCAGGGC 


2400 


GATGGCCCAC 


TACGTGAACC 


ATCACCCAAA 


TCAAGTTTTT 


TGGGGTCGAG 


2450 


GTGCCGTAAA 


GCACTAAATC 


GGAACCCTAA 


AGGGAGCCCC 


CGATTTAGAG 


2500 


CTTGACGGGG 


AAAGCCGGCG 


AACGTGGCGA 


GAAAGGAAGG 


GAAGAAAGCG 


2550 


AAAGGAGCGG 


GCGCTAGGGC 


GCTGGCAAGT 


GTAGCGGTCA 


CGCTGCGCGT 


2600 
AACCACCACA


CCCGCCGCGC 


TTAATGCGCC 


GCTACAGGGC 


GCGTACTATG 


2650 


GTTGCTTTGA 


CGTATGCGGT 


GTGAAATACC 


GCACAGATGC 


GTAAGGAGAA 


2700 


AATACCGCAT 


CAGGCGCCAT 


TCGCCATTCA 


GGCTGCGCAA 


CTGTTGGGAA 


2750 


GGGCGATCGG 


TGCGGGCCTC 


TTCGCTATTA 


CGCCAGCTGG 


CGAAAGGGGG 


2800 


ATGTGCTGCA 


AGGCGATTAA 


GTTGGGTAAC 


GCCAGGGTTT 


TCCCAGTCAC 


2850 


GACGTTGTAA 


AACGACGGCC 


AGTGCCAAGC 


TTAAGGTGCA 


CGGCCCACGT 


2900 


GGCCACTAGT 


ACTTCTCGAG 


CTCTGTACAT 


GTCCGCGGTC 


GCGACGTACG 


2950 


CGTATCGATG 


GCGCGAGCTG 


CAGGCGGCCG 


CCATATGCAT 


CCTAGGCCTA 


3000 


TTAATATTCC 


GGAGTATACG 


TAGCCGGCTA 


ACGTTAACAA 


CCGGTACCTC 


3050 


TAGAACTATA 


GCTAGCCAAT 


TCCATCATCA 


ATAATATACC 


TTATTTTGGA 


3100 


TTGAAGCCAA 


TATGATAATG 


AGGGGGTGGA 


GTTTGTGACG 


TGGCGCGGGG 


3150 


CGTGGGAACG 


GGGCGGGTGA 


CGTAGGTTTT 


AGGGCGGAGT 


AACTTGTATG 


3200 


TGTTGGGAAT 


TGTAGTTTTC 


TTAAAATGGG 


AAGTTACGTA 


ACGTGGGAAA 


3250 


ACGGAAGTGA 


CGATTTGAGG 


AAGTTGTGGG 


TTTTTTGGCT 


TTCGTTTCTC 


3300 


GGCGTAGGTT 


CGCGTGCGGT 


TTTCTGGGTG 


TTTTTTGTGG 


ACTTTAACCG 


3350 


TTACGTCATT 


TTTTAGTCCT 


ATATATACTC 


GCTCTGCACT 


TGGCCCTTTT 


3400 


TTACACTGTG 


ACTGATTGAG 


CTGGTGCCGT 


GTCGAGTGGT 


GTTTTTTTAA 


3450 


TAGGTTTTCT 


TTTTTACTGG 


TAAGGCTGAC 


TGTTAGGCTG 


CCGCTGTGAA 


3500 


GCGCTGTATG 


TTGTTCTGGA 


GCGGGAGGGT 


GCTATTTTGC 


CTAGGCAGGA 


3550 


GGGTTTTTCA 


GGTGTTTATG 


TGTTTTTCTC 


TCCTATTAAT 


TTTGTTATAC 


3600 


CTCCTATGGG 


GGCTGTAATG


TTGTCTCTAC


GCCTGCGGGT


ATGTATTCCC


3650 


CCCAAGCTTG 


CATGCCTGCA 


GGTCGACTCT 


AGAGGATCCG 


AAAAAACCTC 


3700 


CCACACCTCC 


CCCTGAACCT 


GAAACATAAA 


ATGAATGCAA 


TTGTTGTTGT 


3750 


TAACTTGTTT 


ATTGCAGCTT 


ATAATGGTTA 


CAAATAAAGC 


AATAGCATCA 


3800 


CAAATTTCAC 


AAATAAAGCA 


TTTTTTTCAC


TGGATTCTAG 


TTGTGGTTTG 


3850 


TCCAAACTCA 


TCAATGTATC 


TTATCATGTC 


TGGATCCCCC 


TAGCTTGCCA 


3900 
AACCTACAGG


TGGGGTCTTT 


CATTCCCCCC 


TTTTTCTGGA 


GACTAAATAA 


3950 


AATCTTTTAT 


TTTATCTATG 


GCTCGTACTC 


TATAGGCTTC 


AGCTGGTGAT 


4000 


ATTGTTGAGT 


CAAAACTAGA 


GCCTGGACCA 


CTGATATCCT 


GTCTTTAACA 


4050 


AATTGGACTA 


ATCGCGGGAT 


CAGCCAATTC 


CATGAGCAAA 


TGTCCCATGT 


4100 


CAACATTTAT 


GCTGCTCTCT 


AAAGCCTTGT 


ATCTTGCATC 


TCTTCTTCTG 


4150 


TCTCCTCTTT 


CAGAGCAGCA 


ATCTGGGGCT 


TAGACTTGCA 


CTTGCTTGAG 


4200 


TTCCGGTGGG 


GAAAGAGCTT 


CACCCTGTCG 


GAGGGGCTGA 


TGGCTTGCCG 


4250 


GAAGAGGCTC 


CTCTCGTTCA 


GCAGTTTCTG 


GATGGAATCG 


TACTGCCGCA 


4300 


CTTTGTTCTC 


TTCTATGACC 


AAAAATTGTT 


GGCATTCCAG 


CATTGCTTCT 


4350 


ATCCTGTGTT 


CACAGAGAAT 


TACTGTGCAA 


TCAGCAAATG 


CTTGTTTTAG 


4400 


AGTTCTTCTA 


ATTATTTGGT 


ATGTTACTGG 


ATCCAAATGA 


GCACTGGGTT 


4450 


CATCAAGCAG 


CAAGATCTTC 


GCCTTACTGA 


GAACAGATCT 


AGCCAAGCAC 


4500 


ATCAACTGCT 


TGTGGCCATG 


GCTTAGGACA 


CAGCCCCCAT 


CCACAAGGAC 


4550 


AAAGTCAAGC 


TTCCCAGGAA 


ACTGTTCTAT 


CACAGATCTG 


AGCCCAACCT 


4600 


CATCTGCAAC 


TTTCCATATT 


TCTTGATCAC 


TCCACTGTTC 


ATAGGGATCC 


4650 


AAGTTTTTTC 


TAAATGTTCC 


AGAAAAAATA 


AATACTTTCT 


GTGGTATCAC 


4700 


TCCAAAGGCT 


TTCCTCCACT 


GTTGCAAAGT 


TATTGAATCC 


CAAGACACAC 


4750 


CATCGATCTG 


GATTTCTCCT 


TCAGTGTTCA 


GTAGTCTCAA 


AAAAGCTGAT 


4800 


AACAAAGTAC 


TCTTCCCTGA 


TCCAGTTCTT 


CCCAAGAGGC 


CCACCCTCTG 


4850 


GCCAGGACTT 


ATTGAGAAGG 


AAATGTTCTC 


TAATATGGCA 


TTTCCACCTT 


4900 


CTGTGTATTT 


TG GTG 1 G AG A 




1 wAl X X uuuL 




4950 


CAGATGTCAT 


CTTTCTTCAC 


GTGTGAATTC 


TCAATAATCA 


TAACTTTCGA


5000 


GAGTTGGCCA 


TTCTTGTATG 


GTTTGGTTGA 


CTTGGTAGGT 


TTACCTTCTG 


5050 


TTGGCATGTC 


AATGAACTTA 


AAGACTCGGC 


TCACAGATCG 


CATCAAGCTA 


5100 


TCCACATCTA 


TGCTGGAGTT 


TACAGCCCAC 


TGCAATGTAC 


TCATGATATT 


5150 


CATGGCTAAA 


GTCAGGATAA 


TACCAACTCT 


TCCTTCTCCT 


TCTCCTGTTG 


5200 
TTAAAATGGA


AATGAAGGTA 


ACAGCAATGA 


AGAAGATGAC 


AAAAATCATT 


5250 


TCTATTCTCA 


TTTGGAACCA 


GCGCAGTGTT 


GACAGGTACA 


AGAACCAGTT 


5300 


GGCAGTATGT 


AAATTCAGAG 


CTTTGTGGAA 


CAGAGTTTCA 


AAGTAAGGCT 


5350 


GCCGTCCGAA 


GGCACGAAGT 


GTCCATAGTC 


CTTTTAAGCT 


TGTAACAAGA 


5400 


TGAGTGAAAA 


TTGGACTCCT 


GCCTTCAGAT 


TCCAGTTGTT 


TGAGTTGCTG 


5450 


TGAGGTTTGG 


AGGAAATATG 


CTCTCAACAT 


AATAAAAGCC 


ACTATCACTG 


5500 


GCACTGTTGC 


AACAAAGATG 


TAGGGTTGTA 


AAACTGCGAC 


AACTGCTATA 


5550 


GCTCCAATCA 


CAATTAATAA 


CAACTGGATG 


AAGTCAAATA 


TGGTAAGAGG 


5600 


CAGAAGGTCA 


TCCAAAATTG 


CTATATCTTT 


GGAGAATCTA 


TTAAGAATCC 


5650 


CACCTGCTTT 


CAACGTGTTG 


AGGGTTGACA 


TAGGTGCTTG 


AAGAACAGAA 


5700 


TGTAACATTT 


TGTGGTGTAA 


AATTTTCGAC 


ACTGTGATTA 


GAGTATGCAC 


5750 


CAGTGGTAGA 


CCTCTGAAGA 


ATCCCATAGC 


AAGCAAAGTG 


TCGGCTACTC 


5800 


CCACGTAAAT 


GTAAAACAGA 


TAATACGAAC 


TGGTGCTGGT 


GATAATCACT 


5850 


GCATAGCTGT 


TATTTCTACT 


ATGAGTACTA 


TTCCCTTTGT 


CTTGAAGAGG 


5900 


AGTGTTTCCA 


AGGAGCCACA 


GCACAACCAA 


AGAAGCAGCC 


ACCTCTGCCA 


5950 


GAAAAATTAC 


TAAGCACCAA 


ATTAGCACAA 


AAATTAAGCT 


CTTGTGGACA 


6000 


GTAATATATC 


GAAGGTATGT 


GTTCCATGTA 


GTCACTGCTG 


GTATGCTCTC 


6050 


CATATCATCA 


AAAAAGCACT 


CCTTTAAGTC 


TTCTTCGTTA 


ATTTCTTCAC 


6100 


TTATTTCCAA 


GCCAGTTTCT 


TGAGATAACC 


TTCTTGAATA 


TATATCCAGT 


6150 


TCAGTCAAGT 


TTGCCTGAGG 


GGCCAGTGAC 


ACTTTTCGTG 


TGGATGCTGT 


6200 




X unnlu iivl 


wnw v* X X W X X 


AACTGAGTGT 


GTCATCAGGT 


6250 


TCAGGACAGA 


CTGCCTCCTT 


CGTGCCTGAA 


GCGTGGGGCC


AGTGCTGATC 


6300 


ACGCTGATGC 


GAGGCAGTAT 


CGCCTCTCCC 


TGCTCAGAAT 


CTGGTACTAA 


6350 


GGACAGCCTT 


CTCTCTAAAG 


GCTCATCAGA 


ATCCTCTTCG 


ATGCCATTCA 


6400 


TTTGTAAGGG


AGTCTTTTGC 


ACAATGGAAA 


ATTTTCGTAT 


AGAGTTGATT 


6450 


GGATTGAGAA 


TAGAATTCTT 


CCTTTTTTCC 


CCAAACTCTC 


CAGTCTGTTT 


6500 
AAAAGATTGT


TTTTTTGTTT 


CTGTCCAGGA 


GACAGGAGCA 


TCTCCTTCTA 


6550 


ATGAGAAACG


GTGTAAGGTC 


TCAGTTAGGA 


TTGAATTTCT 


TCTTTCTGCA 


6600 


CTAAATTGGT 


CGAAAGAATC 


ACATCCCATG 


AGTTTTGAGC 


TAAAGTCTGG 


6650 


CTGTAGATTT 


TGGAGTTCTG 


AAAATGTCCC 


ATAAAAATAG 


CTGCTACCTT 


6700 


CATGCAAAAT 


TAATATTTTG 


TCAGCTTTCT 


TTAAATGTTC 


CATTTTAGAA 


6750 


GTGACCAAAA 


TCCTAGTTTT 


GTTAGCCATC 


AGTTTACAGA 


CACAGCTTTC 


6800 


AAATATTTCT 


TTTTCTGTTA 


AAACATCTAG 


GTATCCAAAA 


GGAGAGTCTA 


6850 


ATAAATACAA 


ATCAGCATCT 


TTGTATACTG 


CTCTTGCTAA 


AGAAATTCTT 


6900 


GCTCGTTGAC 


CTCCACTCAG 


TGTGATTCCA 


CCTTCTCCAA 


GAACTATATT 


6950 


GTCTTTCTCT 


GCAAACTTGG 


AGATGTCCTC 


TTCTAGTTGG 


CATGCTTTGA 


7000 


TGACGCTTCT 


GTATCTATAT 


TCATCATAGG 


AAACACCAAA 


GATGATATTT 


7050 


TCTTTAATGG 


TGCCAGGCAT 


AATCCAGGAA 


AACTGAGAAC 


AGAATGAAAT 


7100 


TCTTCCACTG 


TGCTTAATTT 


TACCCTCTGA 


AGGCTCCAGT 


TCTCCCATAA 


7150 


TCATCATTAG 


AAGTGAAGTC 


TTGCCTGCTC 


CAGTGGATCC 


AGCAACCGCC 


7200 


AACAACTGTC 


CTCTTTCTAT 


CTTGAAATTA ATATCTTTCA 


GGACAGGAGT 


7250 


ACCAAGAAGT 


GAGAAATTAC 


TGAAGAAGAG 


GCTGTCATCA 


CCATTAGAAG 


7300 


TTTTTCTATT 


GTTATTGTTT 


TGTTTTGCTT 


TCTCAAATAA 


TTCCCCAAAT 


7350 


CCCTCCTCCC 


AGAAGGCTGT 


TACATTCTCC 


ATGACTACTT 


CTGTAGTCGT 


7400 


TAAGTTATAT 


TCCAATGTCT 


TATATTCTTG 


CTTTTGTAAG 


AAATCCTGTA 


7450 


TTTTGTTTAT 


TGCTCCAAGA 


GAGTCATACC 


ATGTTTGTAC 


AGCCCAGGGA 


7500 


AA X X wUAu 


TCACCGCCAT 

IwAvVVJvVnl 


GCGCAGAACA ATGCAGAATG AGATGGTGGT 


7550 


GAATATTTTC 


CGGAGGATGA 


TTCCTTTGAT 


TAGTGCATAG 


GGAAGCACAG 


7600 


ATAAAAACAC 


CACAAAGAAC 


CCTGAGAAGA 


AGAAGGCTGA 


GCTATTGAAG 


7650 


TATCTCACAT 


AGGCTGCCTT 


CCGAGTCAGT 


TTCAGTTCTG 


TTTGTCTTAA 


7700 


GTTTTCAATC 


ATTTTTTCCA 


TTGCTTCTTC 


CCAGCAGTAT 


GCCTTAACAG 


7750 


ATTGGATGTT 


CTCGATCATT


TCTGAGGTAA 


TCACAAGTCT 


TTCACTGATC 


7800 
TTCCCAGCTC


TCTGATCTCT 


GTACTTCATC 


ATCATTCTCC 


CTAGCCCAGC 


7850 


CTGAAAAAGG 


GCAAGGACTA 


TCAGGAAACC 


AAGTCCACAG 


AAGGCAGACG 


7900 


CCTGTAACAA 


CTCCCAGATT 


AGCCCCATGA 


GGAGTGCCAC 


TTGCAAAGGA 


7950 


GCGATCCACA 


CGAAATGTGC 


CAATGCAAGT 


CCTTCATCAA 


ATTTGTTCAG 


8000 


GTTGTTGGAA 


AGGAGACTAA 


CAAGTTGTCC 


AATACTTATT 


TTATCTAGAA 


8050 


CACGGCTTGA 


CAGCTTTAAA 


GTCTTCTTAT 


AAATCAAACT 


AAACATAGCT 


8100 


ATTCTCATCT 


GCATTCCAAT 


GTGATGAAGG 


CCAAAAATGG 


CTGGGTGTAG 


8150 


GAGCAGTGTC 


CTCACAATAA 


AGAGAAGGCA 


TAAGCCTATG 


CCTAGATAAA 


8200 


TCGCGATAGA 


GCGTTCCTCC 


TTGTTATCCG 


GGTCATAGGA 


AGCTATGATT 


8250 


CTTCCCAGTA 


AGAGAGGCTG 


TACTGCTTTG 


GTGACTTCCC 


CTAAATATAA 


8300 


AAAGATTCCA 


TAGAACATAA 


ATCTCCAGAA 


AAAACATCGC 


CGAAGGGCAT 


8350 


TAATGAGTTT 


AGGATTTTTC 


TTTGAAGCCA 


GCTCTCTATC 


CCATTCTCTT 


8400 


TCCAATTTTT 


CAGATAGATT 


GTCAGCAGAA 


TCAACAGAAG 


GGATTTGGTA 


8450 


TATGTCTGAC 


AATTCCAGGC 


GCTGTCTGTA 


TCCTTTCCTC 


AAAATTGGTC 


8500 


TGGTCCAGCT 


GAAAAAAAGT 


TTGGAGACAA 


CGCTGGCCTT 


TTCCAGAGGC 


8550 


GACCTCTGCA 


TGGTCTCTCG 


GGCGCTGGGG 


TCCCTGCTAG 


GGCCGTCTGG 


8600 


GCTCAAGCTC 


CTAATGCCAA 


AGGAATTCCT 


GGAGCCCGGG 


GGATCCACTA 


8650 


GTTCTAGAGC 


GGCCGCCACC 


GCGGTGGCTG 


ATCCCGCTCC 


CGCCCGCCGC 


8700 


GCGCTTCGCT 


TTTTATAGGG 


CCGCCGCCGC 


CGCCGCCTCG 


CCATAAAAGG 


8750 


AAACTTTCGG 


AGCGCGCCGC 


TCTGATTGGC 


TGCCGCCGCA 


CCTCTCCGCC 


8800 


TCGCCCCGCC 

A Www W WW WWW 


CCGCCCCfTCG 


CCCCGCCCCG 


CCCCGCCTGG 


CGCGCGCCCC 


8850 


cccccccccc 


CCGCCCCCAT 


CGCTGCACAA 


AATAATTAAA 


AAATAAATAA 


8900 


ATACAAAATT 


GGGGGTGGGG 


AGGGGGGGGA 


GATGGGGAGA 


GTGAAGCAGA 


8950 


ACGTGGCCTC 


GAGTAGATGT 


ACTGCCAAGT 


AGGAAAGTCC 


CATAAGGTCA 


9000 


TGTACTGGGC 


ATAATGCCAG 


GCGGGCCATT 


TACCGTCATT 


GACGTCAATA 


9050 


GGGGGCGTAC 


TTGGCATATG 


ATACACTTGA 


TGTACTGCCA 


AGTGGGCAGT 


9100 
TTACCGTAAA


TACTCCACCC 


ATTGACGTCA 


ATGGAAAGTC 


CCTATTGGCG 


9150 


TTACTATGGG 


AACATACGTC 


ATTATTGACG 


TCAATGGGCG 


GGGGTCGTTG 


9200 


GGCGGTCAGC 


CAGGCGGGCC 


ATTTACCGTA 


AGTTATGTAA 


CGACCTGCAG 


9250 


GCTGATCTCC 


CTAGACAAAT 


ATTACGCGCT 


ATGAGTAACA 


CAAAATTATT 


9300 


CAGATTTCAC 


TTCCTCTTAT 


TCAGTTTTCC 


CGCGAAAATG 


GCCAAATCTT 


9350 


ACTCGGTTAC 


GCCCAAATTT 


ACTACAACAT 


CCGCCTAAAA 


CCGCGCGAAA 


9400 


ATTGTCACTT 


CCTGTGTACA 


CCGGCGCACA 


CCAAAAACGT 


CACTTTTGCC 


9450 


ACATCCGTCG 


CTTACATGTG 


TTCCGCCACA 


CTTGCAACAT 


CACACTTCCG 


9500 


CCACACTACT 


ACGTCACCCG 


CCCCGTTCCC 


ACGCCCCGCG 


CCACGTCACA 


9550 


AACTCCACCC 


CCTCATTATC 


ATATTGGCTT 


CAATCCAAAA 


TAAGGTATAT 


9600 


TATTGATGAT 


GCTAGCATGC 


GCAAATTTAA 


AGCGCTGATA 


TCGATCGCGC 


9650 


GCAGATCTGT 


CATGATGATC 


ATTGCAATTG 


GATCCATATA 


TAGGGCCCGG 


9700 


GTTATAATTA 


CCTCAGGTCG 


ACGTCCCATG 


GCCATTCGAA 


TTCGTAATCA 


9750 


TGGTCATAGC 


TGTTTCCTGT 


GTGAAATTGT 


TATCCGCTCA 


CAATTCCACA 


9800 


CAACATACGA 


GCCGGAAGCA 


TAAAGTGTAA 


AGCCTGGGGT 


GCCTAATGAG 


9850 


TGAGCTAACT 


CACATTAATT 


GCGTTGCGCT 


CACTGCCCGC 


TTTCCAGTCG 


9900 


GGAAACCTGT 


CGTGCCAGCT 


GCATTAATGA 


ATCGGCCAAC 


GCGCGGGGAG 


9950 


AGGCGGTTTG 


CGTATTGGGC 


GC 
CCAATTCCAT


CATCAATAAT 


ATACCTTATT 


TTGGATTGAA 


GCCAATATGA 


50 


TAATGAGGGG


GTGGAGTTTG 


TGACGTGGCG 


CGGGGCGTGG 


GAACGGGGCG 


100 


GGTGACGTAG


GTTTTAGGGC 


GGAGTAACTT 


GTATGTGTTG 


GGAATTGTAG 


150 


TTTTCTTAAA 


ATGGGAAGTT 


ACGTAACGTG 


GGAAAACGGA 


AGTGACGATT 


200 


TGAGGAAGTT 


GTGGGTTTTT 


TGGCTTTCGT 


TTCTGGGCGT 


AGGTTCGCGT 


250 


GCGGTTTTCT 


GGGTGTTTTT 


TGTGGACTTT 


AACCGTTACG 


TCATTTTTTA 


300 


GTCCTATATA 


TACTCGCTCT 


GCACTTGGCC 


CTTTTTTACA


CTGTGACTGA 


350 


TTGAGCTGGT 


GCCGTGTCGA 


GTGGTGTTTT 


TTTAATAGGT 


TTTCTTTTTT 


400 


ACTGGTAAGG 


CTGACTGTTA 


GGCTGCCGCT 


GTGAAGCGCT 


GTATGTTGTT 


450 


CTGGAGCGGG 


AGGGTGCTAT 


TTTGCCTAGG 


CAGGAGGGTT 


TTTCAGGTGT 


500 


TTATGTGTTT 


TTCTCTCCTA 


TTAATTTTGT 


TATACCTCCT 


ATGGGGGCTG 


550 


TAATGTTGTC 


TCTACGCCTG 


CGGGTATGTA 


TTCCCCCCAA 


GCTTGCATGC 


600 


CTGGAGGTCG 


ACTCTAGAGG 


ATCCGAAAAA 


ACCTCCCACA 


CCTCCCCCTG 


650 


AACCTGAAAC 


ATAAAATGAA 


TGCAATTGTT 


GTTGTTAACT 


TGTTTATTGC 


700 


AGCTTATAAT 


GGTTACAAAT 


AAAGCAATAG 


CATCACAAAT 


TTCACAAATA 


750 


AAGCATTTTT 


TTCACTGCAT 


TCTAGTTGTG 


GTTTGTCCAA 


ACTCATCAAT 


800 


GTATCTTATC 


ATGTCTGGAT 


CCCCGCGGCC 


GCTCTAGAAC 


TAGTGGATCC 


850 


CCCGGGCTGC 


AGGAATTCCG 


TAACATAACT 


GCGTGCTTTA 


TTGAGATACA 


900 


CAGTAAAGCA 


GTAATATAAT 


ACAATAGTAA 


GGCATATATT 


TGGTGAAATC 


950 


TGATATGTTG 


TGAAAATGGA 


GTAAAACTGA 


% willing % % * n 

AGTTTAAAAA 


AAT AA 11 A G 1 




AAATGTTACA 


GTGTTGGTGT 


TAAAACACAA 


TCTATTATGA 


TACTCAAGTA 


1050 


AGAGTCCAGT 


ACCTGGAGAC 


AATGATGATA 


CATGCCATGT 


GATGATTATG 


1100 


CTTCAGTTAC 


ACTGATTATG 


ATTTACACTT 


TAATACTTGA 


TGGTTATAAA 


1150 


GAACATGAAA 


TGATGTCCAA 


ATTATGCTTA 


AAATCAGCAA 


TAAAGCTCTC 


1200 


AGTTTTTATT 


CAAATATTTT 


GATAGATTCA 


CTCCAGAACT 


AATATCTAAA 


1250 
AGATAAAACG


AAAAGATTAA 


AACAAAACTA 


TGCACTCTAT


CTACCTTGGA 


1300 


TTTTAGAATG 


AAACTTAAAA 


CTTCTTAGTA 


GGAAAGGAAC


CCCTTGTTTT 


1350 


AAATCTTGGT 


GAAAACAAAT 


CCTTGGATAA 


AGAAAATGCC


CAGTGCCACA 


1400 


TAAAGGAGAG 


AGAGAGAGAA 


AAGCAAGACC 


AGAACCAAAT 


TTCAATTTGT 


1450 


TATCTTAGAG 


CTTTGGGTTT 


TCTTTTGGAA 


ATTATAAATG 


AAAAAAGGAA 


1500 


ACTGGTGTCC 


ACACAACAGA 


CAAGTGGTGA 


AGTTGTGAAA 


TTAGGTGTGC 


1550 


ACAATTACTA 


GAAACACCCC 


AAAACCAAAG 


TGAGGTAGAA 


ATAGCATGAG 


1600 


AAGCTGTGTT 


TGATGTTAAT 


TACAATTAAT 


AATGGACAAA 


ACCCACTCGC 


1650 


TAGAAGTTAA 


TTACACTTGA 


CGTTAGAGGT 


AACAGATTTG 


CAAAATGATA 


1700 


GGACAGTGAT 


TTCTATTGAG 


AGAATGCTCT 


TTAAATGCTA 


AGAAGAAGAA 


1750 


ACTGGCATGA 


GAGGAGTAAA 


GCTCTTCCTA 


GCAGTCCTTA 


GCTTTCTGTT 


1800 


GCACTTTTTC 


TCCTGGTTCA 


ATGACTTGCA 


TTTGTTTAGA 


CATTTCAGCC 


1850 


CGTCAACTAG 


ACGAGAGAGT 


TTGGAGACGC 


TTTTGCTCTC 


AAAACTTTCC 


1900 


AACCACTGTG 


CCTTCTCACC 


CACAATCCTG 


TGTGGAGTTA 


CTTGCAGGGA 


1950 


AACCAATGCA 


AAGGAGACAA 


ATGCAGTTCA 


TGGGCTTCTG 


GACTGATATT 


2000 


CACCAGGGTC 


ACAATGTGAT 


TGGGTTACTT 


TCTTAACAGT 


AATCCTAAGT 


2050 


CTTGCAGCAT 


TAAAAAAAAA 


AATCATCACA 


ATGAAGAAAA 


AAAAACCCAA 


2100 


AAAATCTAAA 


ATCTAAAATT 


CATCATCATC 


ATCAACAACA 


ACAACAACAA 


2150 


CAACAACAAA 


ACCACCCACT 


TCAGGTTGAG 


TTTATGAAGA 


GGGCAGAACA 


2200 


AX X lAuilui 


AAX XAXAuAu 


AIuTjL 1A1AI 


uXAi AG X X u X 


AAAXAX lUAi 




CCATTCTTTT 


ACAGAGTTGT 


TGCTCCCCTC 


ATATAAATTG 


ACTGAGGAGC 


2300 


CGCAACCTTT 


AGCTCCTACC 


ATCTTCCTCC 


TACTGTCTGG 


GAGTTAAAAA 


2350 


TGTCATCTGA 


TGTTCTATTG 


CAGAAACATC 


ATTAAATATA 


ACCCAACAGT 


2400 


AGGAAGTTGA 


ATATATCAGC 


CAACAAATTA 


CTATGATAGT 


AAGTCCTGTG 


2450 


TATTCATTCG 


CATGTTCCTT 


GAAAAAAATG


AATCCTCTAG 


CTCTCAGTGG 


2500 
AAAGTTTAAA


ACTAGAAACA 


TCTGGAGCCC 


TAGACAATAT 


TTTAGTGTGG 


2550 


CGGTAGTCTC 


CTGGCTTTGG 


GCTCCAGGGA 


AAATTCACTC 


TTGCCCAAGC 


2600 


AGATAAGCCC 


AGATGACTAG 


AAGCAATTTC 


CATAGGAAG 


TGGGAAGAAC 


2650 


ATTTGAAGAA 


GTAACTTCAT 


ATCTATTTAT 


CTATATACCT 


ATAGTATTTA 


2700 


TATACTTGTA 


GACATATAGA 


TGTATAAAAT 


GAAAGCCCAT 


AGCGAGCCCC 


2750 


ACTCAGTCAA 


CAATTCTCAA 


AAGAGCAATA 


TGAAGCAGTC 


ATTTGGTGGG 


2800 


GTTCGTATGC 


AAGAAAATAA 


AAAAACGTCA 


TGAATTCCAT 


ATGAATACCA 


2850 


CGCTAAAGTA 


ATGCAAAACA 


ATGTGCTGCC 


TCAGTGTGTG 


TGTGTGTGTG 


2900 


TGTGTGTGTG 


GTGGGTTCGT 


GCATGTATGT 


GTGCGTGTGT 


GTGTGTGTGT 


2950 


GTGTGTGTGT 


GTGTGTGTGC 


GTGTGTGTTT 


GTTTAGGGGT 


TTTTATAAAC 


3000 


AACTTTTTTT 


ATAAAGCACA 


CTTTAGTTTA 


CAATCTCTCT 


TTATAACTGT 


3050 


TATAAATTTT 


TAAACAACCC 


AAAATGCGTT 


CCATATAAAG 


AAATGGGAAG 


3100 


TTATTTAGCT 


ATCAAGATTT 


TACATGTTTT 


CTTTTAACTT 


TTTTGTACAA 


3150 


TTGCATAGAC 


GTGTAAAACC 


TGCCATTGTT 


AACAAAACAA 


TAACAGACTT 


3200 


AGAAACTACT 


GAAATCTACA 


GTATAGTACC 


ACTACCCTTC 


ACAAAAATAT 


3250 


AGATTTTATT 


TCTTGTAAAC 


TCTTACTGTC 


TAATCCTCTT 


TGTTGTACGA 


3300 


ATATTATAAA 


AACCATGCGG 


GAATCAGGAG 


TTGTAAAACA 


TTTATTCTGC 


3350 


TCCTTCTTCA 


TCTGTCATGA 


CTGAAACTAA 


GGACTCCATC 


GCTCTGCCCA 


3400 


AATCATCTGC 


CATGTGGAAA 


AGGCTTCCTA 


CATTGTGTCC 


TCTCTCATTG 


3450 


w*V*X X X wWVJVJw' 


V?V3 wXV X liUll 


ccfcttg a a n 

WW A w A A WUlv 


TAGGGAAGGA 


GTTGTTGAGT 


3500 


TGCTCCATCA 


CTTCTTCTAA 


CCCTGTGCTT 


GTGTCCTGGG 


GAGGACTCAG 


3550 


AAGATCTTCC 


TCACCCATAG 


ATTCTGAAGT 


TTGACTGCCA 


ACCACTCGGA 


3600 


GCAGCATAGG 


CTGACTGCTA 


TCTGACCTCT 


GCAGAGAGGT 


GGAAGGAGAG 


3650 


GACACCGTGG 


TGCCATTCAC 


CTTAGCTTCA 


GCCTGGGGCT 


GCTCCAGGAG 


3700 


CTGTCTCAGT 


CTATGTAACT 


GAGACTCCAG 


CTGTTTATTG 


TGGTCTTCCA 


3750 
GGATTTGCAT


CCTGGCTTCC 


AGGCGTCCTT 


TGTGTTGGCG 


CAGTAGCTTA 


3800 


GCCTCAGCAA 


TGAGCTCAGC 


ATCCCTGGGA 


CTCTGAGGAG 


AGGTGGGCAT


3850 


CATCTCAGGA 


GGAGATGGCA 


GTGGAGACAG 


GCCTTTATGC 


TCATGCTGCT 


3900 


GCTTCAGGCG


ATCATATTCT 


GCTTGCAGAT 


TCCTGTTTTC 


TTCCTCAAGA 


3950 


TCTGCTAGGA 


TTCTCTCTAG 


CTCCCCTCTT 


TCCTCACTCT 


CTAAGGAAAT 


4000 


CAAGATCTGG 


GCAGGACTAC 


GAGGCTGGCT 


CAGGGGGGAG 


TCCTGGTTCA 


4050 


AACTTTGGCA 


GTAATGCTGG 


ATTAACAAAT 


GTTCATCATC 


TATGCTCTCA 


4100 


TTAGGAGAGA 


TGCTATCATT 


TAGATAAGAT 


CCATTGCTGT 


TTTCCATTTC 


4150 


TGCTAGCCTG 


CTAGCATAAT 


GTTCAATGCG 


TGAATGAGTA 


TCATCGTGTG 


4200 


AAAGCTGGGG 


GGACGAGGCA 


GGCGCAGAAT 


CTACTGGCCA 


GAAGTTGATC 


4250 


AGAGTAACGG 


GAGTTTCCAT 


GTTGTCCCCC 


TCTAACACAG 


TCTGCACTGG 


4300 


CAGGTAGCCC 


ATTCGGGGAT 


GCTTCGCAAA ATACCTTTTG 


GTTCGAAATT 


4350 


TGTTTTTTAG 


TACCTTGGCG 


AAGTCGCGAA 


CATCTTCTCC 


GGATGTAGTC 


4400 


GGAGTGCAAT 


ACTCTACCAT 


GGGGTAGTGC 


ATTTTATGGC 


CCTTTGCAAC 


4450 


TCGGCCAGAA 


AAAAAGCAAC 


TTTGGCAGAT 


GTCATAATTA 


AAATGCTTTA 


4500 


GGCTTCTGTA 


CCTGAATCCA 


ATGATTGGAC 


ACTCCTTACA 


GATGTTACAC 


4550 


TTGGCTTGAT 


GCTTGGCAGT 


TTCAGCAGCA 


GCCACTCTGT 


GCAAGACGGG 


4600 


CAGCCACACC 


ATAGACTGGG 


GTTCCAGGCG 


CATCCAGTCA 


AGGAAGAGAG 


4650 


CAGCTTCAAT 


CTCAGGTTTA 


TTATTGGCAA 


ATTGGAAGCA 


GCTCCTGACA 


4700 


CTCGGCTCAA 


TGTTACTGCC 


CCCAAAGGAA 


GCAACTTCAC 


CCAACTGTCT 


4750 


TGGGATTTGA 


ATAGAATCAT 


GCAGAAGAAG ACCCAGCCTA 


CGCTGGTCAC 
Www x ovj x 


4800 


AAAAGCCAGT 


TGAACTTGCC 


ACTTGCTTGA 


AAAGGTATCT 


GTACTTGTCT 


4850 


TCCAAGTGTG 


CTTTACACAG 


AGAAATGATG 


CCAGTTTTAA 


AAGACAGGAC 


4900 


ACGGATCCTC 


CCTGTTCGTC 


CCGTATCATA AACATTGAGA 


AGCCAGTTGA 


4950 


GACAGATATC 


CACACAGAGA 


GGGAGATTGA 


CGAGATTGTT 


GTGCTCTTGC 


5000 


TCCAGACGAT 


CATAAATTGT 


AGTCAAACAG 


TTAATTATCT 


GCAGGATATC 


5050 
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CATGGGCT6G TCATTTTGCT 
CAGCTGACAG GCTCAAGAGA 
AGCTTCATGG CAGTCCTATA 
TAAAGACTGG TAGAGCTCTG 
GGGTCTCGTG GTTGATATAG 
TCCCAGGGAC CCTGAACTGA 
AAAGTCCCTG TGGGCTTCAT 
CCTGTAGAAG CCTCCATCTG 
TAAGGTGAGA GCTGAATGCC 
GACACGATTG ACATTCTCTT 
TGACTTTTTC AAGGTGATCT 
GGCTGCCAGG ATCCCTTGAT 
TTCATCGGCA GCTTCCTGAA 
TTTTTCTCTG CCAATCAGCT 
TTGACCTCTT CAGCCTGCTT 
TTCTTCAGGA GGCAGTTCTC 
CCAAAGGCTG CTCTGTCAGA 
ATTACAGGTT CTTTAGTTTT 
ATTCTGCTTC TGAACTGCTG 
TCAGTTCATC ATCTTTCAGC 
AGATGCAAAC GCTTCCACTG 
GTTGAGAGAC TTTTTCTGAA 
AACGTCTTTG TAACAGGGGT 
ATTTTTTGGC CATTTTCATC 
AATTTCTCCT TGGAGATCTT 
TGGAGTCTTC TAGGAGCTTC 
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TGAGGTTGTG 


CTGGTCCAGG 


TCCAAGCAAA 


GGGCCTTCTG 


CGCGGAGAAC 


CTGACATTAT 


TCATTTTGGG 


GTGGTCCCAA 


TAGGGCACTT 


TGTTTGGTGA 


AGTGGAAAGG 


AAGTGCTGGG 


GCAGCTGTCT 


GACACGGTCC 


GTATTCAGAT 


CTTCCAAAGT 


CAGTGTGGTC 


AGCTGATGTG 


TAAGAGGTGC 


AATTTCTCCC 


TGCAGAGAGT 


CAATGAGGAG 


CACCTCAGCT 


TGGCGCAACT 


GTTCCTGGAG 


TCTTTCAAGA 


GAGCGCAGGT 


TCAATTTGTC 


TCGTAGGAGC 


CGAGTGACAT 


TGGGCTCCTG 


GTAGAGTTTC 


AATATTCTCA 


CAGTCTCCAG 


CAATTCCCTC 


TTGAAGGCCC 


GGAAATCACC 


ACCGATGGGT 


TGTAGCCAAA 


CAAGAAGTTC 


GTCAGAAtTI J. 


GUI "1 \XAAA 1 


GTTCACTCCA 


CTTGAAATTC 


GCTTCATCCG 


AACCTTCCAG 


AAGATTGTGA 


TAGATATCTG 


GCCATGGTTT 


CATCAGCTCT 


TCCTTACGGG 


AAGCGTCCTG 



GCATCACATG 


5100 


GAGCCTTCTG 


5150 


TCAGGTCAGC 


5200 


CAAGTGGTTT 


5250 


GATGGCTCTC 


5300 


ATGCAGGACC 


5350 


TCCACAGCCA 


5400 


GCTGAGGTTA 


5450 


CAAGGTCATT 


5500 


CGAAGTGCCT 


5550 


ATCCCCCACT 


5600 


TGAGGTCCAG 


5650 


GCTTCATCTA 


5700 


CCATTCAGCG 


5750 


TCTGAGCTCT 


5800 


TCTAGTCCTT 


5850 


AGTACTCATG 


5900 


TATGTATATC 


5950 


GCCTGACGGC 


6000 


CTGAAGAGAA 


6050 


GGGACCTAAT 


6100 


ATGTTATCCA 


6150 


GGATCTCAGG 


6200 


TGTGAGTTTC 


6250 


CTGACTCCCC 


6300 


TAGGACATTG 


6350 
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GCAGTTGTTT CTGCTTCCGT 
AGGGAACTGC TGCAGTAATC 
CACTTACTCT TTTATGAATG 
ATCATGTGTA CTTTTCTGGT 
CAGTGCCAAA TCATTTGCCA 
CTTTGGCCAA CTGCTTGGTT 
GTGTGAGGAC CTTCTTTCCA 
GACCTGTTCG GCTTCTTCCT 
ACATTTCATT CAACTGTTGT 
TCCCACTGAA TCTGAATTCT 
TTCTTGATTG CTGGTTTTGT 
CTTCCAATTG GGGGCGTCTC 
TTGATGATCA TTTCATTGAT 
TGATTTTATA ACTCGATCAA 
AAGCTCGGTT GAAGTCTGCC 
GGCATTTCTA GTTTGGAGAT 
CACTAGAGTA ACAGTCTGAC 
GGGCACGGTC AGGCTGCTTT 
GCCTCCCACT CAGACCTCAG 
GCTTGGTTTT TCCTTATACA 
CATCCGCTTG TTTACCGTGA 
GGTCCTGCCT GACTTGGTTG 
AGAGACCCAC AGAAGCAGGT 
CTTTTAAGTG AACCTCAAGC 
ACCTTTATCC ACTGGAGATT 
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FIGURE 12F 

AATCGAGGAA AGAAACTTCT 
TATGAGTTTC TTCCAAAGCA 
TTTCCCCAAG AAGTATTGAT 
ATCATCAGCA GAATAGTCCC 
CGTCTACACT TATCTGCCGT 
TCTGTGATCT TCTTTTGGAT 
TGAGTCAAGC TTGCCTCTGA 
TAGCTTCCAG CCATTGTGTT 
CTCCTGTTCT GCAGCTGTTC 
TTCAATTCGA TCAGTAATGA 
TTTTCAAATT CTGGGCAGCA 
TGTTCCAAAT CTTGCAGTGT 
GTCTTCCAGA TCACCCACCA 
GCAGAGACAG CCAGTCTGTA 
AGTGCAGGTA CCTCCAACAG 
GACAGTTTCC TTAGTAACCA 
TGGCAGAGGC TCCAGTAGTG 
GTCCTCAGCT CCCGAAGTAA 
ATCTTCTAAC TTCCTCTTCA 
AATGCTGCCC TTTCGACAAA 
ACTGTTACTT CAATCTCCTT 
GTTATAAATT TCCAACTGGT 
GATCCAGCTG CTCTTCAAGC 
TCTCCTTGTT TCTCAGGTAA 
TGTCTGTTTG AGCTTCTTTT 



CCAGGTCCAG 


6400 


GCCTCTTGCT 


6450 


ATTCTCTGTT 


6500 


GAAGAAGTTT 


6550 


TGACGGAGGT 


6600 


TGCATCTACT 


6650 


CCTGTCCTAT 


6700 


GAATCCTTTA 


6750 


TTGAACCTCA 


6800 


TTGTTCTAGC 


6850 


GTAATGAGTT 


6900 


TGCCTTCTGT 


6950 


TCACTCTCTG 


7000 


AGTTCTGTCC 


7050 


CAAAGAAGAT 


7100 


CAGATTGTGT 


7150 


CTCAGTCCAG 


7200 


ATGGTTTACA 


7250 


CTGGCTGAGT 


7300 


AGCCTTTCCA 


7350 


TATGTCAAAC 


7400 


TTCTAATAGG 


7450 


TGCCTAAAAT 


7500 


AGCTCTGGAG 


7550 


CAAGTTTATC 


7600 
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TTGCTCTTCT 


GGCCTTATGG 


GAGCACTTAC 


AAGTACTGCT 


CCTCCTGTTT 


7650 


CATTTAATTG 


TTTTAGAATT 


CCCTGGCGCA 


GGGGCAACTC 


TTCTGCCAGT 


7700 


AACTTGACTT


GTTCAAGTTG 


TTCTTTTAGC 


TGCTGCTCAT 


CTCCAAGTGG 


7750 


AGTAATAGCA


ATGTTATCTG 


CTTCTTCCAG 


CCACAAAACA 


AATTCATTTA 


7800 


AATCTCTTTC 


AAATTCTGAC 


AAGACATTCT 


TTTGTTCTTC 


AATCCTCTTT 


7850 


CTCCTTTCTG 


CCAGCTCTTT 


GCAGATGTCG 


TGCCACCGCA 


GACTCAAGCT 


7900 


TCCTAATTTT 


TCTTGTAGAA 


TATTGACATC 


TGTTTTTGAA 


GACTGTTGAA 


7950 


TTATTTCTTC 


CCCAGTTGCA 


TTCAGTGTTC 


TGAGAAGAGC 


TTGACGCTGC 


8000 


CCAATGCCAT 


CCTGGAGTTC 


CTTAAGATAC 


CATTTGTATT 


TAGCATGTTC 


8050 


CCAGTTTTCA 


GGATTTTGTG 


TCTTTTTGAA 


AAACTGTTCA 


ACTTCATTCA 


8100 


GCCATTGATT 


AAATACCTTC 


ATATCATAAT 


GAAAGTGTCG 


CCATTTTTCA 


8150 


ACTGATCTGT 


CGAATCGCCC 


TTGTCGTTCC 


TTGTACATTC 


TATGAAGTTT 


8200 


TTCCCCCTGG 


AAATCCATCT 


GTGCCACGGC 


TTCCTGTACT 


TTCACCTTTT 


8250 


CCATGGAGGT 


GGGACTTTGC 


AAGGCTGCTG 


TCTTCTTCTT 


GTGAATAATA 


8300 


TCAATCCGAC 


CTGAGATTTG 


TTGCAAATTG 


TCTTTTATAT 


TCTTAAGAGA 


8350 


CTCCTCTTGC 


TTAAAAAGAT 


CTTCAAAATC 


TTTAGCACAG 


AGTTCAGGAG 


8400 


TATTTAGAAG 


ATGATCAACT 


TCTGAAAGAG 


CTTGTAAGAT 


ATGACTGATC 


8450 


TCGGTCAAAT 


AAGTAGAAGG 


GACATAAGAA 


ACATCCAAAG 


GCATATCTTC 


8500 


AGTCGTCACT 


ACCATAGTTT 


CTTCATGGAG 


AGTGTGAATT 


TGTGGAAAGT 


8550 


TGAGTCTTCG 


AAACTGAGCA 


AAATTGCTCT 


CAATTTGCCG 


CCAGCGCTTG 


8600 




TCTGAGTTGG 


CTCCACTGCC 


ATTGCGGCCC 


CATTCTCAGA 


8650 


CAAGCCCTCA 


GCTTGCCTGC 


GCACTGCATT 


CAGCTCCTCT 


TTCTTCTTCT 


8700 


GCAATTCACG 


ATCAATTTCC 


TTTAATTTTC 


TTTCATCTCT 


GGGTTCAGGT 


8750 


AGGCTGGCTA 


AriTrrrrrc 


AATTTCATCC 


AAGCATTTCA 


GGAGATCATC 


8800 


AGCCTGCCTC 


TTGTACTGAT 


ACCACTGGTG 


AGAAATTTCT 


AGGGCCTTTT 


8850 



SUBSTITUTE SHEET (RULE 26)



WO 96/13597 
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TTCTTCTTTG AGACCTCAAA 
AGCTGCTGTT TTATCTTTAT 
TTGTTGTAAG TTGTCTCCTC 
TGTCTTCACT CATATCTTTA 
TGCTGAATTT CAGCCTCCAG 
AAACTGCTCC AATTCCTTCA 
TGTGAGAAAT AGCTGCAAAT 
ACTACTTTCC TGCAGTGGTC 
GTCACGTGTG GAGTCCACCT 
AACGCTTAAG AATGTCTTCC 
TCTAAAAGTT CATCTGCATG 
CTGATCAAAG GTTTCCATGT 
ATTCTTCTAC TCTGGAGGTG 
AGTTTATCTT CTACCAAGGT 
CTCTCCTAAT TCTGTAACAC 
CTTTTTGAGT AGCCTTTCCC 
ATTCCTTCAA CTGCTGATCT 
CCATTCTGTT AAGACATTCA 
AGCATTTCTC CAACTGTTGC 
TTGTAATGCA ATTTCAAAGC 
TTCTGTCTGC TTTTTCTGTA 
CCACTTGAGA CTTGACTTCA 
CTTAGTTGTG ACTGAATTAC 
AGGGAAATGC ATCTTGACTT 
GTTGTTCAAA ATTGGCTGGT 
TCTTGTAATT TTTTCTGTGC 
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FIGURE 12H 

TCCTTGAGAG CATTATGTTT 
TTCCTCTCGC TTTCTCTCAT 
TTTGCAACAA TTCATTTACA 
TTGAAGTCTT CCTCTTTCAG 
TGGTTCAAGC AATTTTTGTA 
AAGGAATGGA GGCCTTTCCA 
CGACGGTTGA GCTCAGAGAT 
ACCGCGGTTT GCCATCAATT 
TTGGGCGCAT GTCATTCATT 
TTTTGTTGTG GTTTCTTCTT 
AATGATCCAC TTTGTGATTT 
GTTTCTGGTA TTCCAACAAA 
ACAGCTATCC AGTTACTGTT 
TTCTTTCTTG CCCAACACCA 
TCTTCAAGTG AGCCTTCTGT 
CAGGCAACTT CAGAATCCAA 
CTTCGTCAAT TCTGTATCTG 
TTTCCTTTCT CATCTTACGG 
TTTCTCTCTG TTACCTTCGC 
TGTTACTCGT TCATCAAGCT 
CAATTTGACG TCCGGTTTTA 
CTCAGGCTTT TATACAAGTT 
TTCCTGTTCA ACACTCTTGG 
CATCTAAAAT CATCTTACTT 
TTTTGGAATA ATCGAAATTT 
AACATCAATT TGTGAAAGAA 



TGTCTGTAAC 


8900 


CTGTGATTCT 


A #t C A 

8950 


GTACCCTCAT 


9000 


ATTCACCCCC 


9050 


TATCTGAGTT 


9100 


GTCTTAATTC 


9150 


TTGGGGCTCT 


9200 


TTGCTGCTTG 


9250 


TCAGCCTTTA 


9300 


TTCAGACTCA 


9350 


GTTCTATGTT 


9400 


AGATTTAGCC 


9450 


CAGAAGACTC 


9500 


TTTTCAAAGA 


9550 


TTCTCAATCT 


9600 


ATTACTTGGC 


9650 


TTGCTGCCAG 


9700 


GACAACTTCA 


9750 


ACCCAACTCA 


9800 


CTTTGGGATT 


9850 


ATCACCATTT 


9900 


CACACAATGA 


9950 


TTTCCAATGC 


10000 


TCCTCTAGAC 


10050 


CATGGAGACA 


10100 


CCCTTTGGTT 


10150 
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GGCATCCTTC 


CCCTGGTTAT 


GTTTCTTCAT 


TTCTTCTAAA 


CTTATCTCAT 


10200 


GACTTGTCAA 


ATCTGATTGG 


ATTTTCTGGG 


CTTCCTGAGG 


CATTTGAGCT 


10250 


GCATCCACCT 


TGTCAGTGAT 


ATAAGCTGCC 


AACTGCTTGT 


CAATGAATTC 


10300 


AAGCGACTCC


TGAATTAAGT 


GCAAGGACTT 


TTCAATTTCC 


TGGGCAGACT 


10350 


GGATACTCTG 


TTCAAGCAAC 


TTTTGTTTCC 


TCACAGCCTC 


TTCATGTAGT 


10400 


TCCCTCCAAC 


GAGAATTAAA 


CGTCTCAAGC 


TCCTCATTGA 


TCAGTTCATC 


10450 


CATGACTCCT 


CCATCTGTAA 


GAGTCTGTGC 


CAATAGACGA 


ATCTGATTTG 


10500 


GGTTCTCCTC 


TGAATGATGC 


ATCAGATTTT 


CAAGAGATTC 


TAGCACTTCA 


10550 


GTGATTTCCT 


CAGGTCCTGC 


AGGAACATTT 


TCCATGGTTT 


TAAGTTTCAA 


10600 


TTCTACTTCA 


TTGAGCCACT 


TGTTTGCTTT 


CTCTAAATAT 


GACAATAACT 


10650 


CATGCCAACA 


TGCCCAAACT 


TCTTCCAAAG 


TTTTGCATTT 


TCCATTCAGC 


10700 


CTGGTGCACA 


GCCATTGGTA 


GTTGGTGGTC 


AGAGTTTCAA 


GTTCCTTTTT 


10750 


TAAGGCCTCT 


TGTGCTGAGG 


GTGGAGCGTG 


AGCTATTACA 


CTATTTACAG 


10800 


TCTCAGTAAG 


GAGTTTCACT 


TTAGTTTCTT 


TTTGTAGTGC 


CTCTTCTTTA 


10850 


GCTCTCTTCA 


TTTCTTCAAC 


AGCAGTCTGT 


AATTCATCTG 


GAGTTTTATA 


10900 


TTCAAAATCT 


CTCTCTAGAT 


ATTCTTCTTC 


AGCTTGTGTC 


ATCCACTCAT 


10950 


GCATCTCTGA 


TAGATCTTTT 


TGGAGGCTTA 


CGGTTTTATC 


CAAACCTGCC 


11000 


TTTAAGGCTT 


CCTTTCTGGT 


GTAGACCTGG 


CGGCATATGT 


GATCCCACTG 


11050 


AGTGTTAAGC 


TCTCTAAGTT 


CTGTCTCCAG 


TCTGGATGCA 


AACTCAAGTT 


11100 


CAu vi 1 Uit x 


Ull lAltl It 




CATTAACACT 


ATTTAAACTG 


11150 


GGCTGAATTG 


TTTGAATATC 


ACCAACTAAA 


AGTCTGCATT 


GTTTGAGCTG 


11200 


TTTTTTCAGG 


ATTTCAGCAT 


CCCCCAGGGC 


AGGCCATTCC 


TCTTTCAGGA 


11250 


AAACATCAAC 


TTCAGCCATC 


CATTTCTGTA 


AGGTTTTTAT 


GTGATTCTGA 


11300 


AATTTTCGAA 


GTTTATTCAT 


ATGTTCTTCT 


AGCTTTTGGC 


AGCTTTCCAC 


11350 


CAACTGGGAG 


GAAAGTTTCT 


TCCAGTGCCC 


CTCAATCTCT 


TCAAATTCTG 


11400 



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ACAGATATTT CTGGCATATT 
ACAGTGTCAC TCAGATAGTT 
TTGCAGAGCC TGTAATTTCC 
TAACACTAAG ATAAGGTACA 
GTCCTGATGC TACTCATTGT 
AAAAATTGTC TGTAGCTCTT 
GGTTAAAATG ATTAGTAAAG 
CCCTGTCCCT TTTCTTTCAG 
TTGAGGCTGA AGAGCTGACA 
ACTGGCTTTT AATTGCTGTT 
AACAAGTTTT CGGCAGTAGT 
ATAAAAGGTA ATGATGTTGG 
TCAGCAATTG GCAGAATTCT 
TGTCTGATAC TTTCAGCATT 
GGCCTGAGCT GATCTGCTGG 
TTTCTCGTGC TATGGCATTG 
TCTTTTCGAT AGACTGCAAA 
AGTAATCCAA CTGTGAAGTT 
GTTCAGAATC CACAGTTATC 
AGTTCCTCTT GGGCATGTTT
AGTTACCGTT TCCATTACAG 
TGACAGCCTG TGAAATTTGT 
TTGTCCCAAC GTTGTGCAAA 
CACTGACTTA TTTTTCAGTG 
GTTTTTCCAT GGTTGGCTTT 
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FIGURE 


12J 


TCTGAAGGTG 


CTTTCTTGGC 


GAAGCCATTT 


TGTTGCTCTT 


CGAGTCTCTC 


CTCCATTATT 


GAGAGTTTGC 


TTTCTGACTG 


CTCCTGATAG 


CGCATTGGTG 


TCTCTTTGGC 


CCTCACACCA 


GCCACAAAGT 


CTGCATCCAG 


TTGTAGACTC 


TGAATTTTTA 


ATCTGTTGAC 


TTCATCCTTA 


GGCTCTGATA 


GGGTGGTAGA 


TGTCATCTGT 


TCCAATTGTT 


TTTGATACTC 


TAGCCAGTTA 


GTCCACCGGC 


TGTTCAGTTG 


AACACCCTCA 


TTTGCCATCT 


CATCTTGCAG 


TTTTCTGAAC 


ACTTTTTCTT 


GCAAGTCTGA 


TTCAGAACTC 


TGTAATACAG 


CAGTTATATC 


GACATCCAAC 


TGCCTCTTCT 


TTTGAGGAGG 




lul itttlilj 


TTGTCTGTGT 


TAGGGATGGT 


GCTGAACTCT 


TTTCAAGTTT 


GTTTTCCATC 


CAGATTTCCA 


CCGAAAGTAG 


ATCTTGATTG 


TTCTTTTCTA 


GATCTATTTT 



CATCTCCTTC 


11450 


TCAAAGAACT 


11500 


TCATATTCAG 


11550 


CTGGATCCAC 


11600 


GTAAAGTGTC 


11650 


TCAAAGATGT 


11700 


AAACATTGGC 


11750 


ATTGCTCAAT 


11800 


CAAATTTTTA 


11850 


CTGGGTTTTC 


11900 


GTAGCTGATT 


11950 


ACTCTCTCAC 


12000 


TTCTGAAGCT 


12050 


GTTCCACCAG 


12100 


TTCTCTGCTT 


12150 


GATGTTGCCT 


12200 


CTTCTGAACG 


12250 


CTTTTCCTGA 


12300 


TGGTGGTGGA 


12350 


TGGTCACCAT 


12400 


TGAGTGGTGG 


12450 


TTGGGTTAAA 


12500 


TCTTTTGAGT 


12550 


AGTGAACTTA 


12600 


TAAAGTAGAT 


12650 



ATTTTGTGAA GACTTGACAT
CTGAATGTTC TTCATTGCAT 
GGCACTGTTC TTCAGTAAAA 
ACAATCCAGC GGTCTTCAGT 
CAGTACCTTA AGTTGTTCTT 
ATTCATCAAC CACTACTACC 
TCCTGTTCTA GATCTTCTTG 
TAGATCTTCA AGATCAGGTC 
TCTCTTCAGT TTTTGTTAAC 
TGGAGATCGA TTAGAACTTT 
TACCCTGAGA CATTCCCATC 
CTTCAGCTTC TTCATCTTCT 
CCTAACTGTA GAACATTACC 
CATGAATCCC TCATGAGCAT 
TTGAAATCTC TCCTTGTGCT 
GAAAGTACTT CTTCTAAAGC 
CTCCATCAAT GAACTGTCAA 
GTGAAGGATA GGGGCTCTGT 
TGTGTGAAGG CATAACTCTT 
TTCATAGCCC TGTGCTAGAC 
GGTGATGTAA TTGAAAATGT 
GGCAACATTT CCACTTCTTG 
AACTTGAAAG AGTGATGTGA 
AAGTGGTAGC AACATCTTCA 
CATTTTGCAA TGTTGAAGGC 
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FIGURE 12K 

CATTTCATTT TGATCTTTAA 
CTTCTTTTTC TGAAAGCCAT 
TGCTGCCATT TTAGAAGAAT 
CCATCTGCAG ATATTTGCCC 
CGAAAGCAGC TGTTGCATGA 
ATGTGAGTGA GCGAGTTGAC 
AAGCACCTTA TGTTGTTGTA 
CAAAGGGCTC TTCCTCCATT 
CAGTCATCTA GTTCTTTTAA 
GTGTAATTTG CTTTGTTTTT 
TTGAATTTAG GAGATTCATT 
GATAATTTCC CTTTTCCAAC 
AACAAGTCCT TGATGAGATG 
GAAACTGTTC TTTCACTTCT 
CGCAATGTAT CCTCGGCAGA 
AGTTTGGTAA CTATCCAGAT 
GTGACTTGTC TCTGGGAGCT 
GTGGAATCAG AGGTGGGAAC 
GAATCGAGGC TTAGGAGATG 
TGACTGTGAT CTGTTGAGAG 
TCTTCTCTAG TTACTTTTGA 
AATGGCTTCA ATGCTCACTT 
TGTACATTAA GATGGACTTC 
GGATCAAGAA GTTTTTCTAT 
ATGTTCCAGT CTTTGGGTGG 
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AGCCACTT6T 


12700 


GTACTAAAAA 


12750 


ATCTTGTAAA 


12800 


ATCGATCTCC 


12850 


TCACCGCTGG 


12900 


CCTGACCTGC 


12950 


CTTGGCATTT 


13000 


TTCTTAGTTC 


13050 


TTTCTGATTC 


13100 


CCATGCTAGC 


13150 


TGTTCTTGCA 


13200 


TAGTTGACTT 


13250 


TCAGATCCAT 


13300 


TCAACATCAT 


13350 


AAGAAGCCAT 


13400 


TTACTTCCGT 


13450 


TCCAAATGCT 


13500 


ATAAGCAGCC 


13550 


AAGAAGTTTG 


13600 


TAATGCATCT 


13650 


AGATGTCCTG 


13700 


GTTGTGGCAA 


13750 


TTGTCTGGAT 


13800 


GCCTAACTGG 


13850 


CTGAGTGCTG 


13900 



TGAAACCACA CTATTCCAAT 
GAGCATTCAA AGCCAACCCG
TTAACCTGTG GATAATTACG 
CTTTTCACTG TTGGTTTGCT 
TTTTGACCTG CCAGTGGAGG 
TTATGATTTC CATCCACTAT 
ATTATTTTTC TGTAAGACCC 
GAACTCTTGT AGATCCCTTT 
TCCAAGAGGT CTAGGAGGCG 
GTCTATGTGT TGCTTTCCAA 
TGAATGTTTT CTTTTGAACA 
TCCCACCAAA GCATTTGGAA 
ATCTTGGTAA AAGTTTCTCC 
GATGAGAAGC CAATAAACTT 
TAGCACTTCA AGTCTTCCTA 
AGAGCGGAAT TCCTGCAGCC 
GGGTACAATT CCGCAGCTTT 
TAGAAGTAAA GGCAACATCC 
CACCGGATCC GGGACCTGAA 
TAACTTTCTG GTTTTTCAGT 
AGGCTGGATC GGTCCCGGTG 
GCGTCTCCAG GCGATCTGAC 
CCTCCCACCG TACACGCCTA 
TTACGACATT TTGGAAAGTC 
ATTGACGTCA ATGGGGTGGA 
ATCCACGCCC ATTGATGTAC 
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FIGURE 12L 
CAAACAGGTC GGGCCTGTGA 
TCGGACCAGC TAGAGGTGAA 
TGTTGACTGT CGAACCCAGC 
GCAATCCAGC CTTGATAGTT
ATTATATTCC AAATCAAACC 
GTCAGTGCTT CCTATATTCA 
GCAGTGCCTT GTTGACATTG 
TCTTTTGGCA GTTTTTGCCC 
TTTTCCATCC TGCAGGTCAC 
ACTTAGAAAA TTGTGCATTT 
TCTTCTCTTT CATAACAGTC 
GAAAAAGTAT ATATCAAGGC 
CAGTTTTATT GCTCCAGGAG 
CAGCAGCCTT GACAAAAAAA 
TTCGTTTTTT CTATAAAGCT 
CGGGGGATCC ACTAGTTCTA 
TAGAGCAGAA GTAACACTTC 
ACTGAGGAGC AGTTCTTTGA 
ATAAAAGACA AAAAGACTAA 
TCCTCGAGTA CCGGATCCTC 
TCTTCTATGG AGGTCAAAAC 
GGTTCACTAA ACGAGCTCTG 
CCGCCCATTT GCGTCAATGG 
CCGTTGATTT TGGTGCCAAA 
GACTTGGAAA TCCCCGTGAG 
TGCCAAAACC GCATCACCAT 
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CTATGGATAA 


13950 


GTTGATGACG 


14000 


TCAGAAGAAT 


14050 


TTCATCACAT 


14100 


AAGAGTGAGT


14150 


CTAAATCAAC 


14200 


TTCAGGGCAT 


14250 


TGTAAGGCCT 


14300 


TGAAGAGGTT 


14350 


ATCCATTTTG 


14400 


CTCTACTTCT 


14450 


AGGGATAAAA 


14500 


GCTTAGGTAC 


14550 


AAAAAAAAAA 


14600 


ATTGCCTTCA 


14650 


GAGCGGCCGC 


14700 


CGTACAGGCC 


14750 


TTTGCACCAC 


14800 


ACTTACCAGT 


14850 


TAGAGTCCGG 


14900 


AGCGTGGATG 


14950 


CTTATATAGA 


15000 


GGCGGAGTTG 


15050 


ACAAACTCCC 


15100 


TCAAACCGCT 


15150 


GGTAATAGCG 


15200 
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WO 96/13597 



ATGACTAATA CGTAGATGTA 
GTACTGGGCA TAATGCCAGG 
GGGGCGTACT TGGCATATGA 
TACCGTAAAT ACTCCACCCA 
TACTATGGGA ACATACGTCA 
GCGGTCAGCC AGGCGGGCCA 
TCGACTCTAG AGGATCTCCC 
AAAATTATTC AGATTTCACT 
CCAAATCTTA CTCGGTTACG 
CGCGCGAAAA TTGTCACTTC 
ACTTTTGCCA CATCCGTCGC 
ACACTTCCGC CACACTACTA 
CACGTCACAA ACTCCACCCC 
AAGGTATATT ATTGATGATG 
CAATGATCAT CATGACAGAT 
TTGCGCATGC TAGCTATAGT 
GCTACGTATA CTCCGGAATA 
GCCGCCTGCA GCTGGCGCCA 
GTACAGAGCT CGAGAAGTAC 
TTGGCACTGG CCGTCGTTTT 
TACCCAACTT AATCGCCTTG 
ATAGCGAAGA GGCCCGCACC 
AATGGCGAAT GGCGCCTGAT 
TATTTCACAC CGCATACGTC 
CGCATTAAGC GCGGCGGGTG 
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FIGURE 12M 

CTGCCAAGTA GGAAAGTCCC 
CGGGCCATTT ACCGTCATTG 
TACACTTGAT GTACTGCCAA 
TTGACGTCAA TGGAAAGTCC 
TTATTGACGT CAATGGGCGG 
TTTACCGTAA GTTATGTAAC 
TAGACAAATA TTACGCGCTA 
TCCTCTTATT CAGTTTTCCC 
CCCAAATTTA CTACAACATC 
CTGTGTACAC CGGCGCACAC 
TTACATGTGT TCCGCCACAC 
CGTCACCCGC CCCGTTCCCA 
CTCATTATCA TATTGGCTTC 
CTAGCGGGGC CCTATATATG 
CTGCGCGCGA TCGATATCAG 
TCTAGAGGTA CCGGTTGTTA 
TTAATAGGCC TAGGATGCAT 
TCGATACGCG TACGTCGCGA 
TAGTGGCCAC GTGGGCCGTG 
ACAACGTCGT GACTGGGAAA 
CAGCACATCC CCCTTTCGCC 
GATCGCCCTT CCCAACAGTT 
GCGGTATTTT CTCCTTACGC 
AAAGCAACCA TAGTACGCGC 
TGGTGGTTAC GCGCAGCGTG 
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ATAAGGTCAT 


15250 


ACGTCAATAG 


15300 


GTGGGCAGTT 


15350 


CTATTGGCGT 


15400 


GGGTCGTTGG 


15450 


GACCTGCAGG 


15500 


TGAGTAACAC 


15550 


GCGAAAATGG 


15600 


CGCCTAAAAC 


15650 


CAAAAACGTC 


15700 


TTGCAACATC 


15750 


CGCCCCGCGC 


15800 


AATCCAAAAT 


15850 


GATCCAATTG 


15900 


CGCTTTAAAT 


15950 


ACGTTAGCCG 


16000 


ATGGCGGCCG 


16050 


CCGCGGACAT 


16100 


CACCTTAAGC 


16150 


ACCCTGGCGT 


16200 


AGCTGGCGTA 


16250 


GCGCAGCCTG 


16300 


ATCTGTGCGG 


16350 


CCTGTAGCGG 


16400 


ACCGCTACAC 


16450 
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TTGCCAGCGC 


CCTAGCGCCC 


GCTCCTTTCG 


CTTTCTTCCC 


TTCCTTTCTC 


16500 


GCCACGTTCG


CCGGCTTTCC 


CCGTCAAGCT 


CTAAATCGGG 


GGCTCCCTTT 


16550 


AGGGTTCCGA 


TTTAGTGCTT 


TACGGCACCT 


CGACCCCAAA 


AAACTTGATT 


16600 


TGGGTGATGG 


TTCACGTAGT 


GGGCCATCGC 


CCTGATAGAC 


GGTTTTTCGC 


16650 


CCTTTGACGT 


TGGAGTCCAC 


GTTCTTTAAT 


AGTGGACTCT 


TGTTCCAAAC 


16700 


TGGAACAACA 


CTCAACCCTA 


TCTCGGGCTA 


TTCTTTTGAT 


TTATAAGGGA 


16750 


TTTTGCCGAT 


TTCGGCCTAT 


TGGTTAAAAA 


ATGAGCTGAT 


TTAACAAAAA 


16800 


TTTAACGCGA 


ATTTTAACAA 


AATATTAACG 


TTTACAATTT 


TATGGTGCAC 


16850 


TCTCAGTACA 


ATCTGCTCTG 


ATGCCGCATA 


GTTAAGCCAG 


CCCCGACACC 


16900 


CGCCAACACC 


CGCTGACGCG 


CCCTGACGGG 


CTTGTCTGCT 


CCCGGCATCC 


16950 


GCTTACAGAC 


AAGCTGTGAC 


CGTCTCCGGG 


AGCTGCATGT 


GTCAGAGGTT 


17000 


TTCACCGTCA 


TCACCGAAAC 


GCGCGAGACG 


AAAGGGCCTC 


GTGATACGCC 


17050 


TATTTTTATA 


GGTTAATGTC 


ATGATAATAA 


TGGTTTCTTA 


GACGTCAGGT 


17100 


GGCACTTTTC 


GGGGAAATGT 


GCGCGGAACC 


CCTATTTGTT 


TATTTTTCTA 


17150 


AATACATTCA 


AATATGTATC 


CGCTCATGAG 


ACAATAACCC 


TGATAAATGC 


17200 


TTCAATAATA 


TTGAAAAAGG 


AAGAGTATGA 


GTATTCAACA 


TTTCCGTGTC 


17250 


GCCCTTATTC 


CCTTTTTTGC 


GGCATTTTGC 


CTTCCTGTTT 


TTGCTCACCC 


17300 


AGAAACGCTG 


GTGAAAGTAA 


AAGATGCTGA 


AGATCAGTTG 


GGTGCACGAG 


17350 


TGGGTTACAT 


CGAACTGGAT 
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